VKParser

class soika.src.utils.data_getter.vk_data_getter.VKParser
API_VERISON = '5.131'
COUNT_ITEMS = 100
TIMEOUT_LIMIT = 15
static comments_to_dataframe(comments)

Convert comments to a DataFrame.

Parameters:

comments – List of comments to be converted.

Returns:

A DataFrame containing specific columns from the input comments.

Return type:

DataFrame
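
As a rough illustration of this kind of conversion, the sketch below builds a DataFrame from a list of comment dictionaries with pandas. The column names (`id`, `from_id`, `date`, `text`) and the helper name `comments_to_dataframe_sketch` are assumptions made for the example; the columns that `comments_to_dataframe` actually keeps are not listed in this reference.

```python
import pandas as pd

# Hypothetical subset of comment fields; the real method may keep other columns.
ASSUMED_COLUMNS = ["id", "from_id", "date", "text"]


def comments_to_dataframe_sketch(comments):
    """Build a DataFrame from comment dicts, keeping only the assumed columns."""
    df = pd.DataFrame(comments)
    # reindex adds any missing column (filled with NaN) and drops the extras.
    return df.reindex(columns=ASSUMED_COLUMNS)


sample_comments = [
    {"id": 1, "from_id": 10, "date": 1700000000, "text": "first", "extra": "x"},
    {"id": 2, "from_id": 11, "date": 1700000100, "text": "second"},
]
df = comments_to_dataframe_sketch(sample_comments)
```

Selecting columns with `reindex` keeps the output shape stable even when some comments lack a field.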

get_comments(owner_id, post_id, access_token)

Get comments for a post on VK using the specified owner ID, post ID, and access token.

Parameters:
  • owner_id (int) – The ID of the post owner.

  • post_id (int) – The ID of the post.

  • access_token (str) – The access token for authentication.

Returns:

A list of dictionaries containing comment information.

Return type:

list
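
The sketch below shows how the three arguments could map onto a VK API request. It assumes the `wall.getComments` method is the one being called, which this reference does not confirm; `build_comments_request` is a hypothetical helper, and the constants mirror the class attributes `API_VERISON` and `COUNT_ITEMS` listed above. No network request is made.

```python
from urllib.parse import urlencode

API_VERSION = "5.131"  # mirrors the class attribute API_VERISON
COUNT_ITEMS = 100      # mirrors the class attribute COUNT_ITEMS


def build_comments_request(owner_id, post_id, access_token, offset=0):
    """Return the URL and query parameters for one page of comments."""
    params = {
        "owner_id": owner_id,
        "post_id": post_id,
        "count": COUNT_ITEMS,
        "offset": offset,
        "access_token": access_token,
        "v": API_VERSION,
    }
    # Assumed endpoint: VK's wall.getComments method.
    return "https://api.vk.com/method/wall.getComments", params


url, params = build_comments_request(-1, 42, "TOKEN")
query = urlencode(params)  # query string as it would appear in the request
```

Increasing `offset` by `COUNT_ITEMS` on each call is one way to page through a long comment thread.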

static get_group_name(domain, accsess_token)
static get_group_post_ids(domain, access_token, post_num_limit, step) → list

Retrieves the post IDs of a group, given its domain, an access token, a post number limit, and a step size. Returns a list of post IDs.
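
How the limit and the step size interact is not spelled out here. One plausible reading, sketched below under that assumption, is that posts are requested in chunks of `step` until `post_num_limit` is covered. The helper `page_offsets` is hypothetical and is not part of VKParser.

```python
def page_offsets(post_num_limit, step):
    """Yield (offset, count) pairs covering post_num_limit items in chunks of step."""
    for offset in range(0, post_num_limit, step):
        # The last chunk may be smaller than step.
        yield offset, min(step, post_num_limit - offset)


pages = list(page_offsets(250, 100))
# pages == [(0, 100), (100, 100), (200, 50)]
```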

static get_owner_id_by_domain(domain, access_token)

Get the owner ID of a VK group by its domain.

Parameters:
  • domain (str) – The domain of the VK group.

  • access_token (str) – The access token for the VK API.

Returns:

The owner ID of the VK group, or None if the request was not successful.

Return type:

int
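
The "ID or None" contract can be sketched as follows. The example assumes a response shaped like the one returned by VK's `groups.getById` in API version 5.131 (a `response` list of group objects) and applies the VK convention that a community's wall owner ID is the negative of its group ID. Whether VKParser uses this method or returns the negated value is not stated here; `owner_id_from_response` is a hypothetical helper that works on an already parsed JSON payload.

```python
def owner_id_from_response(payload):
    """Return the negative owner ID of the first group in payload, or None."""
    groups = payload.get("response") or []
    if not groups:
        return None  # error payload or empty result
    # Community walls are addressed by the negated group ID in the VK API.
    return -int(groups[0]["id"])


ok_id = owner_id_from_response({"response": [{"id": 123, "screen_name": "example"}]})
failed_id = owner_id_from_response({"error": {"error_code": 5}})
```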

static get_subcomments(params)

Retrieves subcomments from the VK API.

Parameters:
  • owner_id (int) – The ID of the owner of the comments.

  • post_id (int) – The ID of the post.

  • access_token (str) – The access token for authentication.

  • params (dict) – Additional parameters for the API request.

Returns:

A list of subcomments retrieved from the API.

Return type:

list

static run_comments(domain, post_ids, access_token)
static run_parser(domain, access_token, cutoff_date, number_of_messages=inf, step=100)

Runs the parser with the given parameters and returns a combined DataFrame of posts and comments.

Parameters:
  • domain – The domain of the VK group to parse.

  • access_token – The user token for authentication.

  • step – The step size for fetching data.

  • cutoff_date – The cutoff date for fetching data.

  • number_of_messages – The maximum number of messages to fetch. Defaults to positive infinity.

Returns:

A combined DataFrame of posts and comments.

static run_posts(domain, access_token, cutoff_date, number_of_messages=inf, step=50)

Retrieves posts from the VK API according to the specified parameters.

Parameters:
  • domain (str) – The domain of the VK group whose posts are being retrieved.

  • access_token (str) – The authentication token for accessing the API.

  • step (int) – The number of posts to retrieve in each API call.

  • cutoff_date (str) – The date to stop retrieving posts (format: "%Y-%m-%d").

  • number_of_messages (float) – The maximum number of messages to retrieve (default is infinity).

Returns:

A DataFrame containing the retrieved posts.

Return type:

pandas.DataFrame
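
The interplay of `cutoff_date`, `number_of_messages`, and `step` can be sketched with a stand-in data source. This is an illustration under stated assumptions, not the library's implementation: posts are assumed to arrive newest first, each post's `date` is assumed to be a Unix timestamp, the cutoff is interpreted as midnight UTC, and `collect_until_cutoff`, `fake_fetch`, and `FAKE_POSTS` are hypothetical names.

```python
import math
from datetime import datetime, timezone


def collect_until_cutoff(fetch_page, cutoff_date, number_of_messages=math.inf, step=50):
    """Collect posts page by page until the cutoff date or the message limit is hit."""
    cutoff_ts = (
        datetime.strptime(cutoff_date, "%Y-%m-%d")
        .replace(tzinfo=timezone.utc)
        .timestamp()
    )
    posts, offset = [], 0
    while len(posts) < number_of_messages:
        page = fetch_page(offset, step)
        if not page:
            break  # no more posts available
        for post in page:
            # Stop at the first post older than the cutoff or once the limit is reached.
            if post["date"] < cutoff_ts or len(posts) >= number_of_messages:
                return posts
            posts.append(post)
        offset += step
    return posts


# Fake data source: ten posts, newest first, one per day going back from 2024-01-10 UTC.
NEWEST = datetime(2024, 1, 10, tzinfo=timezone.utc).timestamp()
FAKE_POSTS = [{"id": i, "date": NEWEST - i * 86400} for i in range(10)]


def fake_fetch(offset, count):
    """Return one page of the fake posts, mimicking an offset/count API."""
    return FAKE_POSTS[offset:offset + count]


recent = collect_until_cutoff(fake_fetch, "2024-01-08", step=4)
limited = collect_until_cutoff(fake_fetch, "2024-01-01", number_of_messages=5, step=4)
```

Using `math.inf` as the default limit keeps the loop condition uniform: without an explicit limit, only the cutoff date or an empty page ends the collection.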