Extract Drugs
Overview
This module extracts drug mentions from the records.
It searches the drug columns listed in the config file (config.json), in the order they are listed.
The drug extraction tool must be installed before this module is run.
command(input_fpath, target_column, search_words)
Build the command to run the drug extraction tool.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
input_fpath | str | path to the input file | required |
target_column | str | the column to search | required |
search_words | str | the search terms | required |
Returns:
Type | Description |
---|---|
list[str] | the command (list) to run |
Source code in opendata_pipeline/extract_drugs.py
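A minimal sketch of how such a command list might be assembled for `subprocess.run`. Only the three parameters and the `list[str]` return type come from the documentation above; the executable name `drug-tool` and every flag are placeholders, not the real tool's interface.

```python
def command(input_fpath: str, target_column: str, search_words: str) -> list[str]:
    """Build an argument list that can be passed to subprocess.run."""
    return [
        "drug-tool",  # hypothetical executable name
        "--input", input_fpath,  # hypothetical flag
        "--target-column", target_column,  # hypothetical flag
        "--search-words", search_words,  # hypothetical flag
    ]


# Example values are made up for illustration.
cmd = command("records.csv", "primary_cause", "fentanyl|heroin")
print(cmd)
```

Returning a list rather than a single string means no shell is involved, so column names containing spaces or quotes need no extra escaping.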
enhance_drug_output(record, target_column, column_level, data_source, tag_lookup)
Enhance drug output with additional columns.
Source code in opendata_pipeline/extract_drugs.py
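The added columns are not listed here, so the following is only a sketch of what enhancing a record could look like. The parameter names match the signature above; the output keys (`source_column`, `column_level`, `data_source`, `tags`) and the `search_term` input key are assumptions made for illustration.

```python
from typing import Any


def enhance_drug_output(
    record: dict[str, Any],
    target_column: str,
    column_level: int,
    data_source: str,
    tag_lookup: dict[str, str],
) -> dict[str, Any]:
    """Return a copy of the record with provenance and tag columns added."""
    enhanced = dict(record)  # copy so the caller's record is not mutated
    enhanced["source_column"] = target_column  # hypothetical key
    enhanced["column_level"] = column_level  # hypothetical key
    enhanced["data_source"] = data_source  # hypothetical key
    # Assumes the tool reports the matched term under "search_term".
    enhanced["tags"] = tag_lookup.get(record.get("search_term", ""), "")
    return enhanced


raw = {"record_id": "1", "search_term": "fentanyl"}
enhanced = enhance_drug_output(raw, "primary_cause", 0, "Example County", {"fentanyl": "opioid"})
print(enhanced)
```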
export_drug_output(drug_results)
Export the drug output to a file.
Source code in opendata_pipeline/extract_drugs.py
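The output format and location are not stated above. As one possible approach, the sketch below writes the results as JSON Lines; the `path` parameter is an addition for illustration and is not part of the documented signature.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def export_drug_output(drug_results: list[dict[str, Any]], path: Path) -> None:
    """Write one JSON object per line (JSON Lines); the format is an assumption."""
    with open(path, "w", encoding="utf-8") as f:
        for record in drug_results:
            f.write(json.dumps(record) + "\n")


# Write to a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "drug_output.jsonl"  # hypothetical file name
    export_drug_output([{"record_id": "1", "search_term": "fentanyl"}], out_path)
    written = out_path.read_text(encoding="utf-8").splitlines()

print(written)
```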
fetch_drug_search_terms()
Fetch drug search terms from the remote GitHub repo.
Returns:
Type | Description |
---|---|
dict[str, str] | a dictionary of search terms and their tags |
Source code in opendata_pipeline/extract_drugs.py
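A sketch of fetching and parsing such a lookup with the standard library. The URL is a placeholder and the JSON layout (a list of objects with `term` and `tags` keys) is an assumption; neither the actual repository nor its file format is given here. Parsing is split out so it can be exercised without network access.

```python
import json
from urllib.request import urlopen

# Placeholder URL: the real repository location is not given in this page.
SEARCH_TERMS_URL = "https://example.com/search_terms.json"


def parse_search_terms(payload: str) -> dict[str, str]:
    """Map each search term to its tags (assumed layout: list of {term, tags})."""
    return {entry["term"]: entry["tags"] for entry in json.loads(payload)}


def fetch_drug_search_terms(url: str = SEARCH_TERMS_URL, timeout: float = 10.0) -> dict[str, str]:
    """Download the search terms and return them as a term -> tags dictionary."""
    with urlopen(url, timeout=timeout) as response:
        return parse_search_terms(response.read().decode("utf-8"))


# Offline demonstration with a made-up payload.
sample = '[{"term": "fentanyl", "tags": "opioid"}, {"term": "xanax", "tags": "benzodiazepine"}]'
lookup = parse_search_terms(sample)
print(lookup)
```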
read_drug_output()
Read the drug output file and yield each record.
Source code in opendata_pipeline/extract_drugs.py
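"Yield each record" suggests a generator. The sketch below assumes the tool writes a CSV file with a header row and takes the file path as an argument; both are assumptions, since the documented function takes no parameters and the file format is not stated.

```python
import csv
import tempfile
from pathlib import Path
from typing import Iterator


def read_drug_output(path: Path) -> Iterator[dict[str, str]]:
    """Lazily yield one dictionary per CSV row, keyed by the header names."""
    with open(path, newline="", encoding="utf-8") as f:
        yield from csv.DictReader(f)


# Demonstration with a made-up output file.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "drug_output.csv"  # hypothetical file name
    out_path.write_text("record_id,search_term\n1,fentanyl\n2,heroin\n", encoding="utf-8")
    rows = list(read_drug_output(out_path))

print(rows)
```

Because the function is a generator, rows are read one at a time, which keeps memory use flat for large output files.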
run(settings)
Run the drug extraction tool.
Source code in opendata_pipeline/extract_drugs.py
run_drug_tool(config, tag_lookup)
Run the drug extraction tool.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | models.DataSource | the data source config | required |
tag_lookup | dict[str, str] | the drug search terms | required |
Returns:
Type | Description |
---|---|
list[dict[str, Any]] | the drug results |
Source code in opendata_pipeline/extract_drugs.py
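A sketch of how a loop over the configured drug columns, in order, could collect results. The `DataSource` class here is a simplified stand-in for `models.DataSource`, its fields are assumptions, and the `search_one` callable is a hypothetical hook that replaces the real tool invocation so the example runs without the tool installed. Joining the terms with `|` is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass
class DataSource:
    """Simplified stand-in for models.DataSource; field names are assumptions."""
    name: str
    drug_columns: list[str]


def run_drug_tool(
    config: DataSource,
    tag_lookup: dict[str, str],
    search_one: Callable[[str, str], Iterable[dict[str, Any]]],  # hypothetical hook
) -> list[dict[str, Any]]:
    """Search each drug column in order and collect the annotated records."""
    search_words = "|".join(tag_lookup)  # assumed separator
    results: list[dict[str, Any]] = []
    for level, column in enumerate(config.drug_columns):
        for record in search_one(column, search_words):
            results.append({**record, "source_column": column, "column_level": level})
    return results


def fake_search(column: str, search_words: str) -> list[dict[str, Any]]:
    """Pretend the tool found one match in every column."""
    return [{"record_id": "1", "search_term": "fentanyl"}]


source = DataSource(name="Example County", drug_columns=["primary_cause", "secondary_cause"])
results = run_drug_tool(source, {"fentanyl": "opioid"}, fake_search)
print(results)
```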