The Clean Up Crew
- Nikolai Hegelstad nikolheg
- Milena Tosic milenato
- Eirik Berg Nordheim eirikbno
- 1 Repository link
- 2 Milestones
- 3 Problem, interpretation
- 4 Assumptions
- 5 Summary of requirements
- 6 Time schedule
- 7 How we are dividing tasks within the group
- 8 Screenshots and screen flows
- 9 Documented learning during project
- 10 Suggested improvements to the API
- 11 Unresolved issues
Milestones
- First meeting - 7th of October
- Created wiki page - 25th of October
- First project meeting - 25th of October
- Project deadline - 29th of November
- Project presentation - 7th of December
Problem, interpretation
The task, A, is made up of two subtasks:
- Find duplicate Singleton events.
- Find duplicate Tracked Entity Instances.
To our understanding, singleton events are mainly minor tasks that are not linked to a specific person. Duplicates often occur when personnel manually transfer paper forms to the DHIS2 system without being informed that someone else, perhaps the person working an earlier shift, has already done so. We therefore assume that duplicate singleton events are exact duplicates, i.e. without misspellings (or at least with few in general).
Duplicate TEIs, on the other hand, are mainly due to misspellings. For instance, a doctor asks for a patient's name to look him up in the database. The patient replies "Bob Bobson", but the doctor looks up "Bob Robson". Since he can't find a "Bob Robson", he decides to create a new instance for this person, and we end up with two TEIs representing the same person: the original "Bobson" and the duplicate "Robson". We are therefore mainly interested in doing a fuzzy string search on the TEI's first and last name to check for duplicates. Of course, there are many other fields which may or may not help to verify that the two are indeed the same person, but our algorithm will only focus on misspellings of the first and last name and leave the comparison of the other attributes to the user for visual confirmation.
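To make the "Bobson" / "Robson" case concrete, here is a minimal sketch of one common fuzzy string measure, the Levenshtein edit distance. The text above does not say which fuzzy algorithm the app uses, so this is an illustration only; the helpers `normaliseName` and `namesAreSimilar` and the threshold of 2 edits are hypothetical.

```typescript
// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions or substitutions needed to turn `a` into `b`.
function levenshtein(a: string, b: string): number {
  // prev[j] = distance between the processed prefix of `a` and the first j chars of `b`.
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) {
    prev.push(j);
  }
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      curr.push(Math.min(
        prev[j] + 1,        // delete a character from `a`
        curr[j - 1] + 1,    // insert a character into `a`
        prev[j - 1] + cost  // substitute (or keep, when the characters match)
      ));
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: join first and last name, lower-case, collapse whitespace.
function normaliseName(first: string, last: string): string {
  return (first + " " + last).trim().toLowerCase().replace(/\s+/g, " ");
}

// Hypothetical rule: two names are "similar" if they are at most `maxEdits` apart.
function namesAreSimilar(a: string, b: string, maxEdits: number = 2): boolean {
  return levenshtein(a, b) <= maxEdits;
}

const original = normaliseName("Bob", "Bobson");
const lookedUp = normaliseName("Bob", "Robson");
const distance = levenshtein(original, lookedUp); // 1: only "b" vs "r" differs
const flagged = namesAreSimilar(original, lookedUp); // true
```

A fixed edit threshold is the simplest choice; a threshold relative to the name length would treat short and long names more evenly.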
Our app is supposed to find these duplicates, let the user of the program confirm / deny that these are in fact duplicates, and then export these duplicates to be handled by an administrator later on.
Assumptions
Singleton events
- the search is performed within a specified clinic
- we don't perform search across different programs
- we search for duplicates within a specified time period
- two singleton events are marked as duplicates if they have exactly the same dataElements
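The exact-match rule for singleton events can be sketched as grouping events by a key built from their data values. This is a minimal illustration, not the app's actual code. It assumes events shaped like the DHIS2 event payload (an `event` ID plus a `dataValues` array of `dataElement`/`value` pairs) and assumes the caller has already restricted the list to one clinic, one program and one time period. The function names are hypothetical.

```typescript
// Assumed, simplified shape of a singleton event.
interface DataValue { dataElement: string; value: string; }
interface SingletonEvent { event: string; dataValues: DataValue[]; }

// Build a key that is identical for two events exactly when they contain
// the same (dataElement, value) pairs, regardless of the order of the pairs.
function dataValueKey(ev: SingletonEvent): string {
  const pairs = ev.dataValues.map(dv => JSON.stringify([dv.dataElement, dv.value]));
  pairs.sort();
  return JSON.stringify(pairs);
}

// Return every group of two or more events that share the same key.
function findExactDuplicates(events: SingletonEvent[]): SingletonEvent[][] {
  const buckets: { [key: string]: SingletonEvent[] } = {};
  events.forEach(ev => {
    const key = dataValueKey(ev);
    if (!buckets[key]) {
      buckets[key] = [];
    }
    buckets[key].push(ev);
  });
  return Object.keys(buckets)
    .map(key => buckets[key])
    .filter(group => group.length > 1);
}

// Made-up sample data: e1 and e2 hold the same values in a different order.
const sampleEvents: SingletonEvent[] = [
  { event: "e1", dataValues: [{ dataElement: "a", value: "1" }, { dataElement: "b", value: "2" }] },
  { event: "e2", dataValues: [{ dataElement: "b", value: "2" }, { dataElement: "a", value: "1" }] },
  { event: "e3", dataValues: [{ dataElement: "a", value: "1" }, { dataElement: "b", value: "3" }] }
];
const duplicateGroups = findExactDuplicates(sampleEvents);
```

Grouping by key needs only one pass over the events, whereas comparing every event with every other event grows quadratically with the number of events.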
Tracked Entity Instances
- the search is performed within a specified clinic or chiefdom
- two TEIs are marked as duplicates if they have a similar first name and last name
- for a more thorough search, use Maiden name, TB Number and National Identifier
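One way the more thorough search could work is to report, for a pair of TEIs already flagged by name, which of the three extra attributes agree, conflict or are missing. The sketch below is an assumption about how such a check might look, not a description of the app. It uses a hypothetical flattened TEI (attribute name mapped to value) and made-up sample values.

```typescript
// Hypothetical flattened TEI: attribute name -> value (undefined when not recorded).
type TeiAttributes = { [attribute: string]: string | undefined };

// The three attributes named in the assumptions above.
const SECONDARY_ATTRIBUTES = ["Maiden name", "TB Number", "National Identifier"];

interface AttributeEvidence {
  agree: string[];    // both TEIs have the attribute and the values match
  conflict: string[]; // both TEIs have the attribute but the values differ
  missing: string[];  // at least one TEI lacks the attribute
}

// Compare the secondary attributes of two TEIs, ignoring case and surrounding whitespace.
function compareSecondaryAttributes(a: TeiAttributes, b: TeiAttributes): AttributeEvidence {
  const evidence: AttributeEvidence = { agree: [], conflict: [], missing: [] };
  SECONDARY_ATTRIBUTES.forEach(name => {
    const left = (a[name] || "").trim().toLowerCase();
    const right = (b[name] || "").trim().toLowerCase();
    if (left === "" || right === "") {
      evidence.missing.push(name);
    } else if (left === right) {
      evidence.agree.push(name);
    } else {
      evidence.conflict.push(name);
    }
  });
  return evidence;
}

// Made-up sample pair that a name search would flag.
const bobson: TeiAttributes = {
  "First name": "Bob", "Last name": "Bobson",
  "TB Number": "TB-001", "National Identifier": "12345"
};
const robson: TeiAttributes = {
  "First name": "Bob", "Last name": "Robson",
  "TB Number": "tb-001", "National Identifier": "99999"
};
const evidence = compareSecondaryAttributes(bobson, robson);
```

Keeping agreeing, conflicting and missing attributes separate leaves the final decision with the user, in line with the visual confirmation step.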
Summary of requirements
- The user should be given some initial information and be able to choose between searching for Singletons or TEIs. Acceptance test result: Accomplished
- The user should then be redirected to the appropriate functionality. Acceptance test result: Accomplished
- The user should be prompted to select an organisation unit of appropriate level and the search for duplicates should start. Acceptance test result: Accomplished
- The duplicates should then be displayed to the user for visual confirmation, and the user should be able to select/unselect true and false positives. Acceptance test result: Accomplished
- Finally, the user should be able to export the singletons as a JSON file. Acceptance test result: Not accomplished
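The last two requirements (select/unselect, then export) can be sketched as a small Redux-style reducer plus a function that serialises the confirmed groups. This is only an illustration of the idea under assumed names; the state shape, the `TOGGLE_GROUP` action and `exportConfirmed` are all hypothetical, and the export requirement was not accomplished in the app itself.

```typescript
// Hypothetical state: candidate duplicate groups plus the user's confirmations.
interface DuplicateGroup { id: string; members: string[]; }
interface ReviewState {
  groups: DuplicateGroup[];
  confirmed: { [groupId: string]: boolean };
}

// Hypothetical action: flip a group between confirmed and not confirmed.
type ReviewAction = { type: "TOGGLE_GROUP"; groupId: string };

// Pure reducer in the Redux style: returns a new state, never mutates the old one.
function reviewReducer(state: ReviewState, action: ReviewAction): ReviewState {
  switch (action.type) {
    case "TOGGLE_GROUP": {
      const confirmed: { [groupId: string]: boolean } = {};
      Object.keys(state.confirmed).forEach(id => {
        confirmed[id] = state.confirmed[id];
      });
      confirmed[action.groupId] = !state.confirmed[action.groupId];
      return { groups: state.groups, confirmed: confirmed };
    }
    default:
      return state;
  }
}

// Serialise only the groups the user confirmed as true duplicates.
function exportConfirmed(state: ReviewState): string {
  const selected = state.groups.filter(group => state.confirmed[group.id] === true);
  return JSON.stringify({ duplicates: selected }, null, 2);
}

// Made-up sample: two candidate groups, the user confirms the first one.
const initialState: ReviewState = {
  groups: [
    { id: "g1", members: ["e1", "e2"] },
    { id: "g2", members: ["e5", "e9"] }
  ],
  confirmed: {}
};
const afterToggle = reviewReducer(initialState, { type: "TOGGLE_GROUP", groupId: "g1" });
const exported = exportConfirmed(afterToggle);
```

The resulting string could be offered as a file download or handed to whatever storage the administrator uses.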
Time schedule
During the project period, every member aimed to work approximately 2 hours a day on the project.
This was upheld for the whole period with the exception of the exam period.
How we are dividing tasks within the group
Our approach to completing this project is divided into two parts.
- Part 1:
During this part we will collaborate as much as possible to create an MVP that serves as a foundation we can build on.
- Part 2:
During this part we define clear tasks for each group member based on their wishes and past experiences. This will be done during the group meetings.
Singleton algorithms: Milena
Tracked Entity Instance algorithms: Eirik
Layout, Redux: Nikolai
Screenshots and screen flows
Documented learning during project
- React - Learned a tremendous amount
- Redux - Learned a tremendous amount
- Fuzzy string algorithms
- NPM ecosystem
Suggested improvements to the API
- Mark offline registries (Android) as probable duplicates.
- Validate upon registration whether a patient is a possible duplicate.
Unresolved issues
Exporting data to the DHIS2 dataStore didn't work.