StreetExtractor
- class soika.src.geocoder.street_extractor.StreetExtractor
- static extract_ner_street(text: str, classifier) -> Series
Extract street addresses from text using a pre-trained custom NER model.
The function cleans the text by removing unnecessary content, applies the custom NER model to extract the addresses it mentions, and returns the address together with a confidence score.
- Parameters:
text (str) – The input text to process and extract addresses from.
classifier – The pre-trained custom NER model applied to the text.
- Returns:
A Series containing the extracted address and confidence score, or [None, None] if extraction fails or the score is below the threshold.
- Return type:
pd.Series
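The return contract described above (an address with its score, or an empty pair on failure or a low score) can be sketched as follows. This is a minimal, dependency-free illustration and not the library's implementation: `fake_classifier`, `extract_with_threshold`, and `SCORE_THRESHOLD` are hypothetical names, the threshold value 0.7 is an assumption (the documentation does not state it), and the result is a plain list rather than the `pd.Series` the documented method returns.

```python
from typing import Callable, List, Optional, Tuple

# Assumed value for illustration; the real threshold is not documented here.
SCORE_THRESHOLD = 0.7

Prediction = Optional[Tuple[str, float]]


def fake_classifier(text: str) -> Prediction:
    """Hypothetical stand-in for the NER model: returns (address, score) or None."""
    known = {
        "Nevsky": ("Nevsky Prospekt", 0.93),
        "Sadovaya": ("Sadovaya Street", 0.41),
    }
    for key, prediction in known.items():
        if key in text:
            return prediction
    return None


def extract_with_threshold(
    text: str,
    classifier: Callable[[str], Prediction],
    threshold: float = SCORE_THRESHOLD,
) -> List[Optional[object]]:
    """Return [address, score], or [None, None] on failure or a low score."""
    try:
        prediction = classifier(text)
    except Exception:
        # Any model failure maps to the documented "empty" result.
        return [None, None]
    if prediction is None:
        return [None, None]
    address, score = prediction
    if score < threshold:
        return [None, None]
    return [address, score]
```

Returning the same two-element shape in every case would let a caller expand the result into two DataFrame columns without special-casing failures.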
- static extract_toponym(text: str, street_name: str) -> str | None
Extract toponyms near the specified street name in the text.
This function identifies the position of a street name in the text and searches for related toponyms within a specified range around the street name.
- Parameters:
text (str) – The text containing the address.
street_name (str) – The name of the street to search around.
- Returns:
The first toponym found if present, otherwise None.
- Return type:
Optional[str]
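The window search described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the toponym list `TOPONYMS`, the 30-character `WINDOW`, and the function name `find_toponym_near` are all hypothetical, since the documentation specifies neither the vocabulary nor the range.

```python
import re
from typing import Optional

# Hypothetical toponym words and window size, chosen for illustration only.
TOPONYMS = ("street", "avenue", "prospekt", "lane", "square")
WINDOW = 30


def find_toponym_near(text: str, street_name: str, window: int = WINDOW) -> Optional[str]:
    """Return the first toponym word within `window` characters of street_name."""
    lowered = text.lower()
    needle = street_name.lower()
    start = lowered.find(needle)
    if start == -1:
        return None  # street name not present in the text
    end = start + len(needle)
    # Slice a fixed-size window around the street name.
    snippet = lowered[max(0, start - window): end + window]
    pattern = r"\b(" + "|".join(map(re.escape, TOPONYMS)) + r")\b"
    match = re.search(pattern, snippet)
    return match.group(1) if match else None
```

For example, `find_toponym_near("Broken lamp on Sadovaya street", "Sadovaya")` finds the word `street` inside the window, while a text that never mentions the street name yields `None`.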
- extractor = <soika.src.geocoder.text_address_extractor_by_rules.NatashaExtractor object>
- static process_pipeline(df: DataFrame, text_column: str, classifier) -> DataFrame