Script 1017: Tagging dimension Script Campaigns Topic

Purpose:

The Python script processes campaign data to extract and assign topics based on a structured naming convention.

To Elaborate

The script processes a dataset of campaign information and extracts a “Topic” from each campaign’s name. Campaign names follow a structured format in which components are separated by a designated separator (underscore by default). The topic is identified by its zero-based position within the name, set through a user-configurable parameter; with the default TOPIC_LOCATION = 1, the topic is the second segment. The extracted topic is written to a Topic column in the dataset, and rows with a missing topic value are then dropped. The result is a per-campaign Topic dimension that can be used to group campaigns in analysis and reporting.
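To make the naming convention concrete, here is a minimal sketch of position-based extraction. The campaign names and the helper name topic_at are hypothetical, chosen only for illustration; the separator and position defaults mirror the script's SEP and TOPIC_LOCATION.

```python
SEP = "_"            # same default separator as the script
TOPIC_LOCATION = 1   # zero-based position of the topic segment

def topic_at(campaign_name, sep=SEP, position=TOPIC_LOCATION):
    """Return the segment at `position`, or '' if the name has too few segments."""
    segments = campaign_name.split(sep)
    return segments[position] if len(segments) > position else ""

# Hypothetical campaign names, for illustration only
examples = ["Brand_Shoes_US", "Generic_Boots", "NoSeparatorHere"]
topics = {name: topic_at(name) for name in examples}
print(topics)
```

A name with no separator, or with fewer segments than the configured position requires, yields an empty string rather than an error.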

Walking Through the Code

  1. Configurable Parameters:
    • The script begins by defining the parameters that control its behavior: the separator used in campaign names (SEP), the zero-based positions of the tag and topic within the campaign name (TAG_LOCATION and TOPIC_LOCATION), and the report and bulk column names. TAG_LOCATION is defined but not used elsewhere in this script.
  2. Function Definition:
    • A function extract_topic is defined to extract the topic from a campaign name. It splits the name on the separator and returns the segment at the specified topic location. If the name contains no separator, or has too few segments, it returns an empty string.
  3. Data Preparation:
    • The input DataFrame (inputDf) is copied to outputDf to preserve the original data while making modifications.
  4. Processing Loop:
    • The script iterates over each row in the input DataFrame, extracting the campaign name and using the extract_topic function to determine the topic. The extracted topic is then assigned to the corresponding row in the output DataFrame.
  5. Data Cleaning:
    • After processing all rows, the script removes any rows from the output DataFrame that have missing topic values to ensure data integrity.
  6. Output:
    • Finally, the script checks if the output DataFrame is empty and prints the results, providing a tabular view of the processed data if available.
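One detail worth keeping in mind for step 5: pandas' dropna removes only null values (NaN or None), and an empty string is not null. The sketch below uses hypothetical sample data to contrast dropna on its own with a filter that also treats empty strings as missing. Whether empty topics should be dropped is an assumption here, based on the stated intent of removing rows without a valid topic.

```python
import pandas as pd

# Hypothetical result of the processing loop: the second row has an empty topic
df = pd.DataFrame({
    "Account": ["A1", "A1", "A2"],
    "Campaign": ["Brand_Shoes_US", "NoSeparatorHere", "Generic_Boots_UK"],
    "Topic": ["Shoes", "", "Boots"],
})

# dropna alone keeps the empty-string row, because '' is not null
kept_by_dropna = df.dropna(subset=["Topic"])

# Treat both nulls and blank strings as "no topic" and filter them out
has_topic = df["Topic"].notna() & (df["Topic"].str.strip() != "")
cleaned = df[has_topic]

print(len(kept_by_dropna), len(cleaned))
```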

Vitals

  • Script ID : 1017
  • Client ID / Customer ID: 1306927457 / 60270313
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Topic
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Autumn Archibald (aarchibald@marinsoftware.com)
  • Created by Autumn Archibald on 2024-04-29 19:12
  • Last Updated by Autumn Archibald on 2024-04-29 19:13

Python Code

# Define the configurable parameters for the script
SEP = '_'  # Separator used in campaign names
TAG_LOCATION = 0  # Zero-based position of the tag: first segment, before the first separator (not used below)
TOPIC_LOCATION = 1  # Zero-based position of the topic: second segment, after the first separator
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_TOPIC = 'Topic'

# Function to extract Topic from campaign name
def extract_topic(campaign_name):
    # Non-string values (e.g. NaN from a blank cell) have no topic
    if isinstance(campaign_name, str) and SEP in campaign_name:
        segments = campaign_name.split(SEP)
        topic = segments[TOPIC_LOCATION] if len(segments) > TOPIC_LOCATION else ''
        return topic
    else:
        return ''

# Copy input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]

    # Extract Topic from campaign name
    topic = extract_topic(campaign_name)
    print("Campaign [%s] => Topic [%s]" % (campaign_name, topic))

    # Update Topic column
    outputDf.at[index, BULK_COL_TOPIC] = topic

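# Drop any rows with missing or empty Topic values
# (dropna alone would keep rows whose Topic is an empty string)
outputDf = outputDf[outputDf[BULK_COL_TOPIC].notna() & (outputDf[BULK_COL_TOPIC] != '')]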

if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")
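As an alternative to the row-by-row loop, the same extraction can be expressed with pandas' vectorized string methods. This is a sketch rather than the script's actual method: the sample inputDf below is a hypothetical stand-in for the DataFrame the platform provides, and the column names are written out literally for brevity.

```python
import pandas as pd

SEP = "_"
TOPIC_LOCATION = 1

# Hypothetical stand-in for the platform-provided inputDf
inputDf = pd.DataFrame({
    "Account": ["A1", "A1", "A2"],
    "Campaign": ["Brand_Shoes_US", "NoSeparatorHere", "Generic_Boots_UK"],
})

outputDf = inputDf.copy()

# str.split yields a list of segments per row; str.get returns NaN
# when the requested position is out of range for that row
outputDf["Topic"] = outputDf["Campaign"].str.split(SEP).str.get(TOPIC_LOCATION)

# Rows without a topic segment are NaN here, so dropna removes them
outputDf = outputDf.dropna(subset=["Topic"])

print(outputDf)
```

In this form, names with too few segments produce NaN instead of an empty string, so dropna removes them without an extra filter. A name with an empty segment at the topic position (for example two consecutive separators) would still produce an empty string and would need the additional check.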

Post generated on 2025-03-11 01:25:51 GMT
