Data Processing Scripts on Statistics Denmark's Server

At the Agency for Digital Government, one of my team’s main tasks was to analyze Danish citizens’ trust in the digital public sector. Each year, we conducted a nationwide survey and published a report with the findings.

Previously, we had to pay Statistics Denmark (DST) to extract or prepare data for analysis. I designed and built a set of Python scripts that automated the full processing pipeline, enabling us to access the data ourselves. This gave our team direct control of the data and reduced the time and cost of producing the analysis-ready datasets used for the reports.

Overview & My Role

I designed the technical setup that enabled our team to efficiently prepare data for analysis. This included planning the workflow, structuring the scripts, and building checks to ensure outputs could be reused reliably year over year. The scripts automated the core steps: merging survey data with DST registers, recoding variables, and producing export tables for analysis and reporting.

  • My role: workflow planning, script development, QA checks
  • Impact: faster data preparation, consistent outputs across survey years, reduced manual workload
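One of the checks mentioned above was making sure each year's output could be reused alongside earlier years. A minimal sketch of such a QA check, using hypothetical column names and category sets (the real schema is not shown here):

```python
import pandas as pd

# Hypothetical expected schema: column name -> allowed category values.
# In practice this would mirror the codebook used across survey years.
EXPECTED_CATEGORIES = {
    "trust_level": {"high", "medium", "low"},
    "region": {"Hovedstaden", "Sjaelland", "Syddanmark",
               "Midtjylland", "Nordjylland"},
}

def check_output(df: pd.DataFrame) -> list[str]:
    """Return a list of problems; an empty list means the dataset passes."""
    problems = []
    for col, allowed in EXPECTED_CATEGORIES.items():
        if col not in df.columns:
            problems.append(f"missing column: {col}")
            continue
        unexpected = set(df[col].dropna().unique()) - allowed
        if unexpected:
            problems.append(f"{col}: unexpected values {sorted(unexpected)}")
    return problems

df = pd.DataFrame({"trust_level": ["high", "low", "medium"],
                   "region": ["Hovedstaden", "Nordjylland", "Sjaelland"]})
print(check_output(df))  # [] — dataset passes
```

Running this against each year's export before analysis catches recoding drift early, instead of during report writing.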

Tech & Tools

  • Language: Python (pandas, NumPy)
  • Environment: Jupyter Notebooks on DST Research Server
  • Outputs: Clean datasets & structured crosstabs for reporting

Process Overview

Process Data

Combine survey results with DST register data and prepare a validated dataset for analysis.
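The merge-and-recode step can be sketched with pandas on toy data. The identifier name (`pnr_anon`), column names, and age bands are illustrative assumptions; the real data lives on the DST Research Server and follows DST's register schemas:

```python
import pandas as pd

# Toy stand-ins for survey responses and register extracts, joined on an
# anonymised person identifier (here called "pnr_anon", an assumption).
survey = pd.DataFrame({"pnr_anon": [1, 2, 3],
                       "trust_score": [4, 2, 5]})
register = pd.DataFrame({"pnr_anon": [1, 2, 3, 4],
                         "age": [34, 58, 71, 29],
                         "region_code": [84, 85, 81, 82]})

# Left-join register variables onto the survey respondents and flag
# any respondents without a register match for review.
merged = survey.merge(register, on="pnr_anon", how="left", indicator=True)
unmatched = merged[merged["_merge"] == "left_only"]
assert unmatched.empty, f"{len(unmatched)} respondents lack register data"

# Recode a register variable into the categories used for reporting.
merged["age_group"] = pd.cut(merged["age"], bins=[0, 39, 64, 120],
                             labels=["18-39", "40-64", "65+"])
```

The `indicator=True` flag makes missing register matches explicit rather than silently producing NaN rows, which keeps the validation step honest.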

Export Datasets

Produce descriptive tables (crosstabs, distributions) and structured datasets in CSV.
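A crosstab export of this kind might look like the following sketch; the variable names and filename are illustrative, not the actual report schema:

```python
import pandas as pd

# Toy analysis-ready dataset (column names are illustrative).
df = pd.DataFrame({"age_group": ["18-39", "40-64", "65+", "18-39", "65+"],
                   "trust_level": ["high", "medium", "high", "low", "high"]})

# Row-normalised crosstab: share of each trust level within each age group.
table = pd.crosstab(df["age_group"], df["trust_level"], normalize="index")

# Export the structured table as CSV for the reporting step.
table.to_csv("trust_by_age_group.csv")
```

Normalising by row (`normalize="index"`) yields the within-group distributions that typically appear in descriptive report tables, rather than raw counts.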

Create Reports

Analyze survey results using the prepared datasets and publish the official trust report.