Script 1017: Tagging dimension Script Campaigns Topic

Purpose

The script extracts a topic from each campaign name and writes it to a Topic column, producing a dataset ready for bulk upload.

To Elaborate

The Python script processes a dataset of campaigns and derives a “Topic” for each one from the campaign’s name. Campaign names are expected to be segments joined by a separator (an underscore by default), and the topic is read from a fixed, zero-based position in that list of segments (the second segment by default). The extracted topic is written to the Topic column of the output dataset. The script then drops rows whose Topic is missing, so that only tagged campaigns remain for further analysis or reporting. Keeping topics consistent in this way supports structured budget allocation (SBA) and other analytical uses.
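
The splitting rule described above can be illustrated with a minimal sketch. The campaign names below are hypothetical and only assume the underscore-separated convention; `topic_of` is an illustrative helper, not a function from the script.

```python
SEP = "_"            # separator assumed in campaign names
TOPIC_LOCATION = 1   # zero-based index of the topic segment

def topic_of(name):
    """Return the segment at TOPIC_LOCATION, or '' if the name is too short."""
    segments = name.split(SEP)
    return segments[TOPIC_LOCATION] if len(segments) > TOPIC_LOCATION else ""

# Hypothetical campaign names, for illustration only
examples = ["Brand_Shoes_US", "Generic_Boots", "NoSeparatorHere"]
topics = {name: topic_of(name) for name in examples}

for name, topic in topics.items():
    print(f"{name} -> {topic!r}")
```

A name with no separator yields a single segment, so there is nothing at index 1 and the topic comes back empty.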

Walking Through the Code

  1. Define Configurable Parameters:
    • The script begins by setting the parameters that control how campaign names are parsed: the separator used in campaign names (SEP) and the zero-based positions of the tag and topic segments (TAG_LOCATION and TOPIC_LOCATION). It also names the report and bulk columns it reads and writes. TAG_LOCATION is defined but not used elsewhere in this script.
  2. Extract Topic Function:
    • The function extract_topic takes a campaign name, splits it on the separator, and returns the segment at TOPIC_LOCATION. If the name contains no separator or has too few segments, it returns an empty string.
  3. Copy Input Data:
    • The script creates a copy of the input dataset (inputDf) to work on, ensuring the original data remains unchanged.
  4. Iterate Through Rows:
    • It loops through each row of the dataset, extracting the campaign name and using the extract_topic function to determine the topic. This topic is then printed for verification and assigned to the corresponding row in the output dataset.
  5. Clean Output Data:
    • After processing all rows, the script removes any rows from the output dataset that do not have a topic, ensuring only complete entries are retained.
  6. Output Results:
    • Finally, the script prints a table view of the output dataset, or the message "Empty outputDf" if no rows remain.
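
The steps above can also be sketched in a vectorized form. This is an alternative under stated assumptions, not the script's own method: it assumes pandas is available, uses the `Campaign` and `Topic` column names from the Vitals, and runs on a small hypothetical DataFrame in place of the platform-provided `inputDf`. The function name `tag_topics` is illustrative.

```python
import pandas as pd

SEP = "_"
TOPIC_LOCATION = 1

def tag_topics(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a Topic column; rows without a topic are dropped."""
    out = df.copy()  # leave the input untouched
    # .str.split gives a list per row; .str[i] yields NaN when the list is too short
    out["Topic"] = out["Campaign"].str.split(SEP).str[TOPIC_LOCATION]
    # Keep only rows whose Topic is neither missing nor empty
    return out[out["Topic"].fillna("") != ""]

# Hypothetical sample data standing in for inputDf
sample = pd.DataFrame({
    "Account": ["A1", "A1", "A2"],
    "Campaign": ["Brand_Shoes_US", "NoSeparator", "Generic_Boots"],
})
result = tag_topics(sample)
print(result.to_string(index=False))
```

The row-by-row loop in the script is easier to trace through its print statements, while the vectorized form avoids iterrows on larger reports.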

Vitals

  • Script ID : 1017
  • Client ID / Customer ID: 1306927457 / 60270313
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Topic
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Autumn Archibald (aarchibald@marinsoftware.com)
  • Created by Autumn Archibald on 2024-04-29 19:12
  • Last Updated by Autumn Archibald on 2024-04-29 19:13

Python Code

# Define the configurable parameters for the script
SEP = '_'  # Separator used in campaign names
TAG_LOCATION = 0  # Index of the tag segment (text before the first separator); not used below
TOPIC_LOCATION = 1  # Index of the topic segment (text between the first and second separators)
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_TOPIC = 'Topic'

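# Function to extract Topic from campaign name.
# Returns '' when the name is not a string (e.g. NaN), has no separator,
# or has too few segments to contain a topic.
def extract_topic(campaign_name):
    if not isinstance(campaign_name, str) or SEP not in campaign_name:
        return ''
    segments = campaign_name.split(SEP)
    return segments[TOPIC_LOCATION] if len(segments) > TOPIC_LOCATION else ''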

# Copy input rows to output
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]

    # Extract Topic from campaign name
    topic = extract_topic(campaign_name)
    print("Campaign [%s] => Topic [%s]" % (campaign_name, topic))

    # Update Topic column
    outputDf.at[index, BULK_COL_TOPIC] = topic

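# Drop any rows with missing or empty Topic values.
# extract_topic returns '' rather than NaN, so dropna alone would keep those rows.
outputDf = outputDf[outputDf[BULK_COL_TOPIC].fillna('') != '']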

if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")
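
One detail of the cleanup step is worth checking: extract_topic returns an empty string for names without a topic, and pandas does not treat an empty string as a missing value. The sketch below uses a small hypothetical DataFrame to compare `dropna` with an explicit filter that also treats `''` as missing.

```python
import pandas as pd

# Hypothetical rows: one tagged, one with an empty topic, one with a missing topic
df = pd.DataFrame({
    "Campaign": ["Brand_Shoes_US", "NoSeparator", None],
    "Topic": ["Shoes", "", None],
})

# dropna removes only NaN/None; the empty-string row is kept
after_dropna = df.dropna(subset=["Topic"])

# Treating '' as missing as well removes both untagged rows
after_filter = df[df["Topic"].fillna("") != ""]

print(len(after_dropna), len(after_filter))
```

Which behavior is wanted depends on whether a blank Topic should be uploaded or skipped; the filter form matches the stated intent of keeping only rows that have a topic.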

Post generated on 2024-11-27 06:58:46 GMT
