Machine Learning - Unit 2: Exploratory Data Analysis

Overview

Exploratory Data Analysis (EDA) is valuable to data science (and AI) projects because it increases confidence that future results will be valid, correctly interpreted, and applicable to the desired business contexts. That level of confidence can be achieved only after the raw data has been validated and checked for anomalies, ensuring that the data set was collected without errors. EDA also helps to uncover insights that were not evident to business stakeholders and data scientists, or that they did not consider worth investigating, but that can be very informative about a particular business.
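As a minimal sketch of what validating raw data and checking it for anomalies can look like, the Python snippet below counts missing entries, computes a few descriptive statistics, and flags outliers with the common 1.5 x IQR rule. The `orders` list and the `summarise` helper are hypothetical and used only for illustration; they are not taken from the module's materials.

```python
import statistics


def summarise(values):
    """Return missing-value count, basic statistics and IQR-based outliers."""
    # Separate recorded values from missing ones (represented here as None).
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    # Quartiles and the interquartile range (IQR).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1

    # Flag values more than 1.5 * IQR outside the quartiles as outliers.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]

    return {
        "count": len(present),
        "missing": missing,
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "outliers": outliers,
    }


# Hypothetical order values with one missing entry and one suspicious value.
orders = [120, 135, None, 128, 131, 990, 125, 140]
report = summarise(orders)
print(report)
```

A flagged value is not necessarily an error; it is a prompt to check how that record was collected before deciding whether to keep, correct, or exclude it.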

My Reflection

In the second week, we built on the introduction from the first week, starting with the concepts of Exploratory Data Analysis (EDA). The unit's readings and seminar introduced the basic concepts of EDA, descriptive statistics, and exploratory data visualisation.

Regarding the team project, colleagues have just started replying to my email. We agreed to continue our collaboration through Microsoft Teams. Initially, some colleagues did not respond. I tried to reach them again by email and also looked them up on LinkedIn. Eventually, we reported the problem to the module's tutor.

Artefacts: Collaborative Discussion 1: The 4th Industrial Revolution

Peer responses

In the second unit, we carried on with the discussion initiated in Unit 1, providing peer responses to the initial posts of two colleagues.

My first peer response

Dear Jordan,

I find your analysis of the Windrush scandal as an example of Industry 4.0's excessive techno-centricity very relevant and interesting. The scandal clearly illustrates how systems design, when it neglects the principle of human-centricity, is highly likely to cause systemic harm. In cases like Windrush, this harm escalates to the level of reputational damage to public institutions, which highlights the socio-political cost of neglecting human-centricity.

Building on Metcalf’s (2024) framing of Industry 5.0, I would argue that the Windrush case also exposes a critical failure in ethical data stewardship. The Home Office’s reliance on incomplete datasets and rigid automation reflects what Floridi (2018) terms “data-centric opacity,” where systems operate without transparency or recourse, especially for vulnerable populations. This aligns with the broader critique that Industry 4.0 often privileges efficiency over empathy (Xu, David and Kim, 2018).

Moreover, the scandal underscores the need for resilience not just in technical infrastructure, but in institutional logic. A resilient system, as advocated by Industry 5.0, would have incorporated feedback loops, human oversight, and contextual safeguards to prevent such injustices. The absence of these mechanisms reveals a governance model that was fragile by design.

All in all, I agree that, in principle, the focus on human-centricity, resilience and sustainability in the framework of Industry 5.0 can lead to systems that are not just more efficient, but more just and humane.

Reference list


My second peer response

Hi Abdulrahman,

I find your analysis of the Texas grid failure as a case study in the limits of Industry 4.0 implementation insightful. The incident not only exposed infrastructural fragility but also highlighted the absence of systemic foresight. I agree that this is an area where Industry 5.0 principles could have made a tangible difference.

Building on Matthew’s point about resilience, I would also add that Industry 5.0’s emphasis on human-machine collaboration is especially relevant in energy systems. As Introna et al. (2024) argue, integrating AI with human decision-making enables dynamic risk assessment and adaptive response strategies, which is critical during cascading failures like the Texas blackout. The lack of predictive modelling and real-time coordination was not just a technical oversight. Rather, it reflected a governance gap where human expertise was under-leveraged.

Moreover, the economic and human toll of the blackout highlights what Brem et al. (2021) describe as the ethical imperative of Industry 5.0: designing systems that safeguard life and livelihood, not just optimise performance. The tragedy was not merely a failure of technology; it was also a failure to anticipate and centre human needs in system design.

All in all, your post effectively bridges the theoretical promise of Industry 5.0 with the practical consequences of its absence. It is a reminder that resilience is not just about redundancy; above all, it is about responsibility.

Reference list