Machine Learning - Unit 8: Training an Artificial Neural Network

Overview

Last week, we learned about the structure of the human neuron and how it influenced the development of the artificial neural network (ANN): its structure and how it works. We will continue with the example ANN discussed last week.

My Reflection

This week was a bit dense, as it had a lot of material to cover, in addition to my parallel learning of how to understand and implement a CNN model, as required for the individual assignment.

The unit covered the training of ANNs, especially the concepts of backpropagation and weight updating, which were covered in the unit's lecturecast as well as one of the readings. In ANNs, updating weights through gradient descent and backpropagation is the core mechanism for learning. During training, the network computes a loss that quantifies prediction error. Backpropagation then calculates the gradient of this loss with respect to each weight by propagating errors backward from the output layer to earlier layers. Gradient descent uses these gradients to adjust the weights in the direction that minimises the loss, typically by subtracting a scaled version of the gradient, controlled by a learning rate. This iterative process enables the network to fine-tune its parameters, gradually improving accuracy and generalisation across training epochs.
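To make this concrete for myself, the loss-gradient-update loop described above can be sketched in a few lines of NumPy. This is a minimal illustration using a single linear neuron with mean-squared-error loss, not code from the unit's materials; all names, values, and hyperparameters (e.g. `learning_rate`, the number of epochs) are my own illustrative choices.

```python
import numpy as np

# Minimal sketch: one linear neuron trained by gradient descent on
# mean-squared-error loss. Illustrative only; values are arbitrary.

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))          # 8 training examples, 3 input features
true_w = np.array([1.5, -2.0, 0.5])  # hypothetical "true" weights
y = X @ true_w                       # targets generated from true_w

w = np.zeros(3)                      # initial weights
learning_rate = 0.1

for epoch in range(200):
    y_hat = X @ w                    # forward pass: predictions
    error = y_hat - y                # prediction error
    loss = np.mean(error ** 2)       # MSE loss quantifying that error
    grad = 2 * X.T @ error / len(y)  # gradient of the loss w.r.t. each weight
    w -= learning_rate * grad        # gradient-descent weight update

print(np.round(w, 2))                # weights move towards true_w as loss shrinks
```

In a multi-layer network, the `grad` line is replaced by backpropagation, which applies the chain rule layer by layer from the output backwards; the update step itself stays the same.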

Another reading introduced the debate on using generative artificial intelligence (AI) for writing, and whether AI-generated text is equivalent to human writing. This was an essential reading for the second collaborative discussion, which started in this unit. In the artefacts below, I have included my initial post to the discussion.

As a final note, the unit also included a seminar that presented the basics of ANNs and their applications. In parallel, I continued my learning on DataCamp and viewed several other sources that helped me understand deep learning and CNNs. I include one of these in the artefacts section below as well.

Artefacts - Collaborative Discussion 2: Legal and Ethical Views on ANN Applications

Initial Post - 'Robo writers' are tools, and tools augment not replace

The rise of AI language models like GPT-3, as discussed by Hutson (2021), marks a transformative moment in how text is generated across domains, from administrative tasks to creative writing. These models, trained on vast corpora and powered by billions of parameters, offer unprecedented fluency and versatility. Yet, their lack of semantic understanding and ethical reasoning presents significant risks.

Administrative Applications

In administrative contexts, AI writers can streamline operations: summarising legal documents, drafting emails, and automating customer service (Brown et al., 2020). This enhances efficiency and reduces cognitive load. However, as Hutson notes, these systems are “mouths without brains”; they generate plausible text without grasping meaning. This poses risks in high-stakes environments like healthcare or law, where misinterpretation can lead to harm (McGuffie and Newhouse, 2020).

Creative Domains

In creative domains, AI can inspire new forms of collaboration. As Hutson (2021) puts it, the poetic outputs of GPT-3 (and its successors) are often “worth editing”. Yet, their tendency to reproduce biases and stereotypes embedded in training data raises concerns about cultural representation and ethical authorship (Bender et al., 2021). The “stochastic parrot” metaphor aptly captures this: AI echoes patterns without critical reflection.

Machine Learning Perspective

From a machine learning perspective, the challenge lies in balancing scale with control. While few-shot learning showcases recent GPT models’ adaptability, its unpredictability underlines the need for robust evaluation frameworks and human oversight (Raffel et al., 2020). As we apply these models to real-world problems, especially under uncertainty, we must critically appraise not just their outputs but their epistemic foundations.

Conclusion

Ultimately, AI writers are tools, not authors. Their integration into workflows should prioritise human-centric design, ethical safeguards, and contextual awareness, echoing the principles of Industry 5.0. The goal is not to replace human creativity or judgment, but to augment it responsibly.

---

References


Artefacts: Useful Additional Resource on CNNs

MIT Introduction to Deep Learning 6.S191: Lecture 3 Convolutional Neural Networks for Computer Vision

This video helped me gain a better understanding of CNNs, the main deep learning architecture that I need to understand and implement for the individual assignment.