Data Engineer - NLP
Harnham
Contract Opportunity: Data Scientist / Data Engineer (Python, Databricks, Document Parsing & Web Scraping)
A leading organisation is seeking an experienced Data Scientist / Data Engineer to join their team on a contract basis. The successful candidate will be responsible for developing and optimising data workflows in a Databricks environment, with a strong focus on Python scripting, document parsing, and web scraping.
Key Responsibilities
* Design, build, and maintain efficient Python scripts for data extraction, transformation, and analysis.
* Develop and manage scalable data pipelines within a Databricks environment.
* Implement solutions for automated document parsing (structured and unstructured data).
* Design and deploy robust web scraping workflows to capture external data sources.
* Collaborate with internal stakeholders to translate business requirements into data-driven solutions.
* Uphold best practices for data quality, integrity, and compliance across all workflows.
Key Skills & Experience
* Proven expertise in Python programming for data engineering tasks.
* Hands-on experience with Databricks (or similar Spark-based platforms).
* Strong knowledge of document parsing libraries and approaches (e.g. PDF/text extraction, NLP preprocessing).
* Experience developing scalable and ethical web scraping solutions.
* Familiarity with data governance, compliance, and secure handling of sensitive data.
* Strong problem-solving skills and ability to work independently in a contract/consulting capacity.
Contract Details
* Contract role (initial duration to be confirmed)
* Location: Amsterdam
* Competitive day rate (depending on experience)
This is an excellent opportunity for a skilled Data Scientist / Data Engineer who thrives in a contract environment and enjoys working on complex data extraction and processing challenges.