Data Labeling: Accurately label and categorize a variety of documents, including text-based documents (contracts, resumes, etc.) and image-based documents (passports, driver's licenses, credit cards, etc.).
Data Extraction: Extract specific information from documents, such as names, addresses, phone numbers, passport numbers, and other relevant entities.
Data Cleaning and Preparation: Clean and preprocess data to ensure accuracy and consistency.
Regular Expression Development: Develop and apply regular expressions to identify and extract specific patterns within text data.
Tool Usage: Use data annotation tools to label data efficiently and accurately.
Quality Assurance: Review labeled data to verify its quality and accuracy.
Collaboration: Work closely with team members and data scientists to refine labeling guidelines and improve annotation processes.
Required Skills and Qualifications:
Strong attention to detail and accuracy
Excellent analytical and problem-solving skills
Ability to work independently and as part of a team
Proficiency in using data annotation tools
Strong understanding of regular expressions
Ability to learn new tools and techniques quickly
Strong organizational and time management skills
Preferred Qualifications:
Experience in data annotation or a related field
Knowledge of machine learning and natural language processing concepts