Codelist mapping is one of the main problems when generating SDTM or SEND datasets, especially when no CDISC controlled terminology was used during data capture.
Codelist mapping is therefore the first area where we have added new "smart" tools to automate the process. In SDTM-ETL v.3.3, automated codelist mapping can now be done on the basis of semantic similarity between codes and their names (meanings), supported by CDISC and company-specific lists of synonyms.
Here is an example of automated codelist mapping for LBTEST:
Before clicking "attempt 1:1 mapping"
After clicking "attempt 1:1 mapping"
Note the checkbox "Also use CDISC Synonym List", which includes the CDISC synonyms in the search (at the cost of speed), and the checkbox "Also use Company Synonym List", which also searches a company-specific list of synonyms (mappings) for CDISC controlled terms. The latter is currently a simple text file, but an interface to a company metadata repository could also be added.
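The idea behind such a 1:1 mapping attempt can be illustrated with a small sketch. This is not the SDTM-ETL implementation: it assumes a plain string-similarity measure (Python's `difflib`) as a stand-in for semantic similarity, and the codelist excerpt, the synonym entries, the function names and the threshold of 0.8 are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical excerpt of a target codelist: submission value -> decoded name.
# Illustrative only, not a complete CDISC codelist.
CDISC_TERMS = {
    "SYSBP": "Systolic Blood Pressure",
    "DIABP": "Diastolic Blood Pressure",
    "PULSE": "Pulse Rate",
}

# Hypothetical synonym list: lowercase synonym -> submission value.
SYNONYMS = {
    "systolic bp": "SYSBP",
}


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def attempt_one_to_one(source_items, terms, synonyms=None, threshold=0.8):
    """Map each source code to a submission value, or None if no good match.

    source_items: {source code: source name}
    terms:        {submission value: decoded name}
    synonyms:     optional {lowercase synonym: submission value}
    """
    mapping = {}
    for code, name in source_items.items():
        # Step 1: an exact hit in the synonym list wins immediately.
        key = name.strip().lower()
        if synonyms and key in synonyms:
            mapping[code] = synonyms[key]
            continue
        # Step 2: otherwise take the most similar term, comparing
        # code against submission value and name against decode.
        best_value, best_score = None, 0.0
        for value, decode in terms.items():
            score = max(similarity(code, value), similarity(name, decode))
            if score > best_score:
                best_value, best_score = value, score
        # Leave the item unmapped when the best candidate is too weak.
        mapping[code] = best_value if best_score >= threshold else None
    return mapping


source = {
    "SBP": "Systolic BP",
    "DBP": "Diastolic blood pressure",
    "XYZ": "Body temperature",
}
with_synonyms = attempt_one_to_one(source, CDISC_TERMS, SYNONYMS)
without_synonyms = attempt_one_to_one(source, CDISC_TERMS)
```

In this sketch, "Systolic BP" is resolved through the synonym list, whereas similarity alone falls below the chosen threshold and leaves the item unmapped. Items without a sufficiently close candidate stay unmapped for manual review.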
If your study does not use CDISC controlled terminology, it may use company-internal controlled terminology instead. For example, "Systolic Blood Pressure" may usually be abbreviated as "Systolic BP" in your company.
To automate codelist mapping further, you can now store your "company synonyms" in a file (Company_CT.txt) and use it when performing automated codelist mapping. This is what the checkbox "Also use Company Synonym List" is for.
You can, however, also extend this list from within the software. For example, our list of "company synonyms" (in the file Company_CT.txt) consists of:
Using the mapping wizard, we select the following mappings:
This shows that the item "Systolic BP" (our company abbreviation/name) has been mapped to "SYSBP" with NCI code "C25298".
Note that the checkbox "Ask to store mappings as Synonyms to Company List" is checked.
As a consequence, the following dialog is shown:
After clicking "OK", the file "Company_CT.txt" is automatically extended and updated:
Each entry lists the CDISC NCI code (as that is the unique identifier) and the company's own synonym.
Please note that a single CDISC NCI code can have more than one company synonym, and that the file can also be updated and edited manually. When the list is updated from within the software, a check for duplicates is made.
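The update-with-duplicate-check behaviour described above can be sketched as follows. The actual layout of the company synonym file is not specified here, so this example assumes one entry per line in the form NCI code, tab, synonym. The function names `load_synonyms` and `add_synonym` are hypothetical and not part of SDTM-ETL.

```python
import os
from collections import defaultdict


def load_synonyms(path):
    """Read the synonym file into {NCI code: [synonym, ...]}.

    Assumed line format (an assumption, not the documented one):
    <NCI code><TAB><company synonym>
    """
    synonyms = defaultdict(list)
    if not os.path.exists(path):
        return synonyms
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            # Skip blank lines and lines that do not follow the assumed format.
            if not line.strip() or "\t" not in line:
                continue
            code, synonym = line.split("\t", 1)
            synonyms[code.strip()].append(synonym.strip())
    return synonyms


def add_synonym(path, nci_code, synonym):
    """Append a synonym for an NCI code unless it is already listed.

    Returns True if the file was extended, False for a duplicate.
    The duplicate check is case-insensitive (a design choice of this sketch).
    """
    existing = load_synonyms(path)
    wanted = synonym.strip().lower()
    if any(s.lower() == wanted for s in existing.get(nci_code, [])):
        return False
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"{nci_code}\t{synonym.strip()}\n")
    return True
```

Keying the entries on the NCI code allows several company synonyms per code, for example both "Systolic BP" and "SBP" for C25298, while repeated submissions of the same synonym leave the file unchanged.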
These synonyms can then be used to improve codelist mapping in subsequent mappings, e.g. in another study.