Geocoding

To geocode texts, see Geocoder.

Geocoder

class soika.src.geocoder.geocoder.Geocoder(df, model_path: str = 'Geor111y/flair-ner-addresses-extractor', device: str = 'cpu', territory_name: str = None, osm_id: int = None, city_tags: dict = {'place': ['state']}, stemmer_lang: str = 'russian', text_column_name: str = 'text')

This class provides simple geocoding functionality.

run(df: DataFrame = None, tags: dict | None = None, group_column: str | None = 'group_name', search_for_objects=False)

Runs the data processing pipeline on the input DataFrame.

Parameters:
  • tags (dict) – The tags to filter by.

  • date (str) – The date of the data to retrieve.

  • df (pd.DataFrame) – The input DataFrame.

  • text_column (str, optional) – The name of the text column in the DataFrame. Defaults to 'text'.

Returns:

The processed DataFrame after running the data processing pipeline.

Return type:

gpd.GeoDataFrame

This method retrieves the GeoDataFrame of areas that correspond to the given OSM ID and tags, preprocesses the area names, and matches each group name to an area; the best match and its admin level are assigned to the DataFrame. It then retrieves other geographic objects and street names, preprocesses the street names, finds their word forms, builds a GeoDataFrame, merges it with the other geographic objects, assigns the street tag, and returns the final GeoDataFrame.
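The step that matches each group name to an area can be illustrated with a small, self-contained sketch. This is not the Geocoder implementation. The helper names (normalize, best_area_match), the similarity measure (difflib.SequenceMatcher from the standard library), the 0.6 threshold, and the sample area names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so formatting does not affect matching."""
    return " ".join(name.lower().split())


def best_area_match(group_name, area_names, threshold=0.6):
    """Return (area, score) for the most similar area name.

    Hypothetical helper: returns (None, score) when no area reaches
    the (assumed) similarity threshold.
    """
    target = normalize(group_name)
    best_name, best_score = None, 0.0
    for area in area_names:
        # ratio() is a similarity score in [0, 1]
        score = SequenceMatcher(None, target, normalize(area)).ratio()
        if score > best_score:
            best_name, best_score = area, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Sample data invented for this sketch
areas = ["Petrogradsky District", "Vasileostrovsky District", "Kirovsky District"]
match, score = best_area_match("petrogradsky  district news", areas)
print(match, round(score, 2))
```

A fixed threshold keeps unrelated group names from being forced onto the nearest area; a suitable value depends on how noisy the group names are.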

OtherGeoObjects

class soika.src.geocoder.city_objects_extractor.OtherGeoObjects
static run(osm_id: int, df: DataFrame, text_column: str) → DataFrame

Runs the module that extracts urban objects other than streets from texts.

StreetExtractor

class soika.src.geocoder.street_extractor.StreetExtractor

more: