Script 1031: Tagging social dimension Audience and Segment

Purpose

The Python script processes campaign data to extract and tag audience and segment information based on predefined keywords.

To Elaborate

The script reads each campaign name and tags the campaign with an audience and a segment. It compares the lowercased name against two predefined keyword lists, one for audiences and one for segments, and writes any matches into the Audience and Segment columns of the dataset. This categorizes campaigns for analysis and reporting. The script also removes rows with missing audience or segment information so that the output is complete.
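
As a rough illustration of the tagging described above, the sketch below applies the same keyword lists to a few campaign names. The function name tag_campaign and the sample campaign names are invented for this example; only the keyword lists come from the script.

```python
import re

# Keyword lists copied from the script's configurable parameters.
AUDIENCE_KEYWORDS = ['pricing', 'downloads', 'awv']
SEGMENT_KEYWORDS = ['b', 'nb', 'd', 'a', 'v', 'n']

def tag_campaign(campaign_name):
    """Return (audience, segment) strings for one campaign name."""
    name = campaign_name.lower()
    # Audience keywords match anywhere in the name (substring test).
    audiences = [kw for kw in AUDIENCE_KEYWORDS if kw in name]
    # Segment keywords must stand alone as whole words.
    pattern = r'\b(?:' + '|'.join(map(re.escape, SEGMENT_KEYWORDS)) + r')\b'
    segments = re.findall(pattern, name)
    return ', '.join(audiences), ', '.join(segments)

# Hypothetical campaign names, invented for illustration.
print(tag_campaign('Pricing - NB - Search'))  # ('pricing', 'nb')
print(tag_campaign('Brand Awareness'))        # ('', '')
print(tag_campaign('pricing_nb'))             # ('pricing', '')
```

The last call shows one consequence of whole-word matching: the regular expression treats an underscore as a word character, so a segment keyword joined to its neighbors by underscores is not picked up. Names that separate tokens with spaces or hyphens are matched as expected.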

Walking Through the Code

  1. Define Configurable Parameters:
    • The script begins by defining lists of keywords for audiences (AUDIENCE_KEYWORDS) and segments (SEGMENT_KEYWORDS). These lists are user-changeable parameters that determine how campaigns are categorized.
  2. Extract Audience and Segment:
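    • A function extract_audience_and_segment is defined to process each campaign name. It converts the name to lowercase, then matches audience keywords as substrings and segment keywords as whole words (using a regular expression with word boundaries). The function returns two strings, each a comma-separated list of the matched keywords, or an empty string when nothing matches.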
  3. Copy and Process Data:
    • The script copies the input data (inputDf) into outputDf for processing. It then iterates over each row of inputDf, reads the campaign name, and calls the function to determine the audience and segment.
  4. Update and Clean Data:
    • For each campaign, the extracted audience and segment are printed and then updated in the outputDf. The script removes any rows from outputDf that have missing values in the audience or segment columns to ensure completeness.
  5. Output Results:
    • Finally, the script checks whether outputDf is empty. If it contains data, the results are printed in tabular form; otherwise the message "Empty outputDf" is printed.
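
The steps above can be sketched without the reporting platform. This is a simplified stand-in, not the script itself: plain dictionaries replace the inputDf and outputDf DataFrames, the sample rows are invented, and it assumes that a row with no matched audience or no matched segment counts as missing and should be discarded.

```python
import copy
import re

AUDIENCE_KEYWORDS = ['pricing', 'downloads', 'awv']
SEGMENT_KEYWORDS = ['b', 'nb', 'd', 'a', 'v', 'n']
SEGMENT_PATTERN = re.compile(r'\b(?:' + '|'.join(SEGMENT_KEYWORDS) + r')\b')

def extract_audience_and_segment(campaign_name):
    """Step 2: return the matched audience and segment keywords as strings."""
    name = campaign_name.lower()
    audience = ', '.join(kw for kw in AUDIENCE_KEYWORDS if kw in name)
    segment = ', '.join(SEGMENT_PATTERN.findall(name))
    return audience, segment

# Hypothetical input rows standing in for inputDf.
input_rows = [
    {'Account': 'Acme', 'Campaign': 'Downloads - B - US'},
    {'Account': 'Acme', 'Campaign': 'Pricing - Display'},
    {'Account': 'Acme', 'Campaign': 'AWV - V - EMEA'},
]

# Step 3: copy the input so the original rows stay untouched.
output_rows = copy.deepcopy(input_rows)

# Step 4: tag each row, then keep only rows where both tags were found.
for row in output_rows:
    row['Audience'], row['Segment'] = extract_audience_and_segment(row['Campaign'])
output_rows = [r for r in output_rows if r['Audience'] and r['Segment']]

# Step 5: print the surviving rows, or a notice if none remain.
if output_rows:
    for r in output_rows:
        print(r)
else:
    print('Empty output')
```

With these sample rows, the second campaign is discarded because its name contains no standalone segment keyword, while the other two keep their tags.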

Vitals

  • Script ID : 1031
  • Client ID / Customer ID: 1306927457 / 60270313
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Audience, Segment
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Autumn Archibald (aarchibald@marinsoftware.com)
  • Created by Autumn Archibald on 2024-04-30 00:09
  • Last Updated by Autumn Archibald on 2024-04-30 00:13

Python Code

import re

# Define the configurable parameters for the script
AUDIENCE_KEYWORDS = ['pricing', 'downloads', 'awv']
SEGMENT_KEYWORDS = ['b', 'nb', 'd', 'a', 'v', 'n']
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_AUDIENCE = 'Audience'
BULK_COL_SEGMENT = 'Segment'

# Function to extract audience and segment from campaign name
def extract_audience_and_segment(campaign_name):
    campaign_lower = campaign_name.lower()
    audience_list = [keyword for keyword in AUDIENCE_KEYWORDS if keyword in campaign_lower]
    segment_list = re.findall(r'\b(?:' + '|'.join(SEGMENT_KEYWORDS) + r')\b', campaign_lower)
    return ', '.join(audience_list), ', '.join(segment_list)

# Copy input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]

    # Extract audience and segment from campaign name
    audience, segment = extract_audience_and_segment(campaign_name)
    print("Campaign [%s] => Audience [%s], Segment [%s]" % (campaign_name, audience, segment))

    # Update columns
    outputDf.at[index, BULK_COL_AUDIENCE] = audience
    outputDf.at[index, BULK_COL_SEGMENT] = segment

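# Drop any rows with missing values; unmatched campaigns yield empty strings, so filter those as well
outputDf = outputDf.dropna(subset=[BULK_COL_AUDIENCE, BULK_COL_SEGMENT])
outputDf = outputDf[(outputDf[BULK_COL_AUDIENCE] != '') & (outputDf[BULK_COL_SEGMENT] != '')]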

if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")

Post generated on 2024-11-27 06:58:46 GMT
