Machine Learning - Unit 10: Natural Language Processing

Overview

Natural Language Processing (NLP) is one of the most rapidly evolving fields within machine learning, enabling AI-driven text processing, language understanding, and conversation systems. The evolution of Transformer-based architectures such as BERT, GPT, and T5 has significantly improved machine understanding of human language. NLP advancements have led to breakthroughs in chatbots, machine translation, information retrieval, and summarisation.

My Reflection

This unit was titled "Natural Language Processing", but it had no materials covering NLP at all! It only had a couple of reading pieces and a seminar, all of which were about CNNs! This type of mistake is very disappointing, especially since it is not the first time that something like this has happened in the programme. Cumulatively, these mistakes are making me less engaged over time and unsure whether the people running the overall programme actually care about the student experience.

Anyhow, one of the reading pieces was a very impressive interactive visual explainer on CNNs, which I kept coming back to, alongside the additional useful resources that I included in the previous units.

Artefacts - Collaborative Discussion 2: Legal and Ethical Views on ANN Applications

Summary Post - AI Writers: Augmentation, Risk, and the Need for Domain-Specific Oversight

In my initial post, I explored the dual nature of AI writers, drawing on Hutson’s (2021) framing of models like GPT-3 as “mouths without brains”: tools capable of generating fluent text without true understanding. I highlighted their benefits in administrative and creative contexts, particularly in enhancing productivity and sparking collaboration. However, I also emphasised the risks: factual inaccuracies, embedded biases, and ethical concerns, especially in high-stakes domains like healthcare and law. I argued for human-centric design and robust evaluation frameworks aligned with Industry 5.0 principles (Metcalf, 2024), positioning AI as a collaborator rather than a replacement.

Adil’s response deepened this analysis by expanding on the “stochastic parrot” metaphor (Bender et al., 2021), noting that AI systems not only echo patterns but can amplify biases through feedback loops (Weidinger et al., 2021). He raised concerns about epistemic risks in creative and scientific writing, where the shift from creation to evaluation demands high domain expertise. The “competence without comprehension” problem (Marcus and Davis, 2020) underscores the danger of plausible-sounding misinformation, especially when users lack the skills to detect subtle errors.

Both posts converge on the need for domain-specific validation protocols and ethical safeguards. As AI tools become more embedded in professional and creative workflows, their governance must evolve to ensure accuracy, fairness, and transparency. The future of AI writing lies not in automation alone, but in thoughtful augmentation guided by human judgment.

---

References

Bender, E.M. et al. (2021) ‘On the dangers of stochastic parrots: Can language models be too big?’, Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pp. 610–623. doi:10.1145/3442188.3445922.

Hutson, M. (2021) ‘Robo-writers: the rise and risks of language-generating AI’, Nature, 591, pp. 22–25. doi:10.1038/d41586-021-00530-0.

Marcus, G. and Davis, E. (2020) ‘GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about’, MIT Technology Review, 22.

Metcalf, G.S. (2024) ‘An Introduction to Industry 5.0: History, Foundations, and Futures’, in Nousala, S., Metcalf, G. and Ing, D. (eds) Industry 4.0 to Industry 5.0. Singapore: Springer Nature Singapore.

Weidinger, L. et al. (2021) ‘Ethical and social risks of harm from language models’, arXiv preprint, arXiv:2112.04359.