A circular diagram of the data life cycle, taken from https://nbisweden.github.io/module-organising-data-dm-practices/101-introduction-to-data-organization/index.html

Data Organisation

June 2022 - June 2023

When I joined CC Data in 2020, the data function was decentralized, with data analysts from different teams working on disparate projects. Shared resources such as a GitHub repository, servers, and surveillance applications were managed haphazardly. This disjointed approach caused several problems, including code duplication, inefficient resource allocation, and a lack of shared best practices. Moreover, some critical revenue-generating processes were known only to specific staff members, which prevented them from taking a proper holiday.

My primary task was to streamline the data organization, fostering a more collaborative and efficient environment. I aimed to resolve the prevalent problems of duplication of effort, lack of coordination between teams, inefficient resource management, and absence of clear ownership for critical processes.

The first step towards restructuring was fostering regular communication. I initiated weekly touchpoints and quarterly retrospectives to discuss shared problems, devise solutions, and track progress.

To address the absence of shared best practices and code documentation, I set up a dedicated space on Notion. We created an analyst handbook, guides for our Kubernetes cluster, and documentation on where shared resources are located.

To ensure smoother operations, I introduced quarterly time reporting for analysts and forecasting of analyst availability. This improved project planning and made better use of the analysts' time.

I reorganized the GitHub repository based on product, designating clear owners for each section to reduce code duplication. I also implemented clear ownership of shared resources and used weekly planning sessions to discuss any upcoming issues.

For a more robust application environment, I transitioned production code from development servers to a Kubernetes cluster, segregating development and production environments.

To further improve code quality and collaboration, I developed libraries for common functions, implemented pre-commit hooks for code standardization, and fostered a similar coding style across teams.
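To illustrate the kind of check a pre-commit hook can run, here is a minimal Python sketch. It is not the hook setup used at CC-Data, which is not described here. The function names `find_style_problems` and `main` are hypothetical, and the two rules it enforces (no trailing whitespace, newline at end of file) are assumptions chosen only as simple examples of code standardization.

```python
from pathlib import Path


def find_style_problems(text: str) -> list[str]:
    """Return human-readable style problems found in one file's text."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Hypothetical rule 1: no trailing whitespace on any line.
        if line != line.rstrip():
            problems.append(f"line {number}: trailing whitespace")
    # Hypothetical rule 2: non-empty files end with a newline.
    if text and not text.endswith("\n"):
        problems.append("missing newline at end of file")
    return problems


def main(paths: list[str]) -> int:
    """Check each file; return 1 if any problem is found, otherwise 0."""
    failed = False
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        for problem in find_style_problems(text):
            print(f"{path}: {problem}")
            failed = True
    return 1 if failed else 0
```

When a script like this is registered as a local hook, the pre-commit framework passes the staged file names as command-line arguments and treats a non-zero exit code as a failure. An entry point would therefore call `sys.exit(main(sys.argv[1:]))`.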

These changes brought significant improvements to CC-Data's operations. The migration to Kubernetes improved visibility of code-related issues, enabling proactive problem resolution before clients were impacted. Having clear resource ownership provided the DevOps team with precise points of contact, leading to more efficient issue resolution and cost savings.

The introduction of regular team meetings and time reporting improved project planning, enabling teams to work together more effectively. It also helped to ensure continuous operation, even during prolonged staff absences.

The Notion documentation and standardized code practices reduced the time spent rediscovering resources and redoing previously completed work. They also facilitated inter-team collaboration and made processes more efficient.

The changes I introduced at CC-Data led to a more collaborative and efficient environment, improving operations, enhancing client satisfaction, and ultimately increasing productivity.