Script 1017: Tagging dimension Script Campaigns Topic
Purpose:
The Python script processes campaign data to extract and assign topics based on a structured naming convention.
To Elaborate
The Python script is designed to process a dataset containing campaign information, specifically focusing on extracting a “Topic” from each campaign’s name. The campaign names follow a structured format where different components are separated by a designated separator (underscore by default). The script identifies and extracts the topic based on its position within the campaign name, which is defined by user-configurable parameters. The extracted topics are then added to the dataset, and any rows without a valid topic are removed. This process helps in organizing and categorizing campaign data efficiently, allowing for better analysis and reporting.
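To make the naming convention concrete, the short sketch below lists the zero-based position of each component in a campaign name. The name "Brand_Shoes_US" is a hypothetical example and does not come from the dataset; with the default settings, position 1 would be read as the topic.

```python
# Hypothetical campaign name, used only for illustration
campaign_name = "Brand_Shoes_US"
SEP = "_"  # default separator described above

# Map each zero-based position to the component found there
positions = dict(enumerate(campaign_name.split(SEP)))
for position, segment in positions.items():
    print(position, segment)
```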
Walking Through the Code
- Configurable Parameters:
- The script begins by defining several parameters that control its behavior, such as the separator used in campaign names (SEP) and the positions of the tag and topic within the campaign name (TAG_LOCATION and TOPIC_LOCATION).
- Function Definition:
- A function extract_topic is defined to extract the topic from a campaign name. It splits the name using the separator and retrieves the segment at the specified topic location, returning an empty string when the name contains no separator or has too few segments.
- Data Preparation:
- The input DataFrame (inputDf) is copied to outputDf to preserve the original data while making modifications.
- Processing Loop:
- The script iterates over each row in the input DataFrame, extracting the campaign name and using the extract_topic function to determine the topic. The extracted topic is then assigned to the corresponding row in the output DataFrame.
- Data Cleaning:
- After processing all rows, the script removes any rows from the output DataFrame that have missing topic values to ensure data integrity.
- Output:
- Finally, the script checks if the output DataFrame is empty and prints the results, providing a tabular view of the processed data if available.
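The steps above can also be sketched without an explicit loop by using the pandas string accessor. This is an alternative illustration, not the script's own method: the sample inputDf is made up to stand in for the DataFrame the platform supplies, and the column names are assumed to match the defaults.

```python
import pandas as pd

SEP = "_"           # separator used in campaign names
TOPIC_LOCATION = 1  # zero-based position of the topic

# Hypothetical sample data standing in for the platform-supplied inputDf
inputDf = pd.DataFrame({
    "Account": ["Acme", "Acme", "Acme"],
    "Campaign": ["Brand_Shoes_US", "Generic", "Promo_Hats"],
})

# Work on a copy so the input stays untouched
outputDf = inputDf.copy()

# Split each name on the separator and take the segment at TOPIC_LOCATION;
# names with too few segments yield NaN
outputDf["Topic"] = outputDf["Campaign"].str.split(SEP).str.get(TOPIC_LOCATION)

# Remove rows where no topic could be extracted
outputDf = outputDf.dropna(subset=["Topic"])
print(outputDf)
```

Because out-of-range positions come back as NaN rather than an empty string, dropna is enough to discard campaigns that do not follow the convention in this variant.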
Vitals
- Script ID : 1017
- Client ID / Customer ID: 1306927457 / 60270313
- Action Type: Bulk Upload
- Item Changed: Campaign
- Output Columns: Account, Campaign, Topic
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Autumn Archibald (aarchibald@marinsoftware.com)
- Created by Autumn Archibald on 2024-04-29 19:12
- Last Updated by Autumn Archibald on 2024-04-29 19:13
Python Code
# Define the configurable parameters for the script
SEP = '_'  # Separator used in campaign names
TAG_LOCATION = 0  # Zero-based position of the tag: the segment before the first separator
TOPIC_LOCATION = 1  # Zero-based position of the topic: the segment between the first and second separators
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
BULK_COL_ACCOUNT = 'Account'
BULK_COL_CAMPAIGN = 'Campaign'
BULK_COL_TOPIC = 'Topic'

# Function to extract Topic from campaign name
def extract_topic(campaign_name):
    # Names without a separator do not follow the convention
    if SEP in campaign_name:
        segments = campaign_name.split(SEP)
        # Return '' when the name has too few segments
        topic = segments[TOPIC_LOCATION] if len(segments) > TOPIC_LOCATION else ''
        return topic
    else:
        return ''

# Copy input rows to output (inputDf is supplied by the platform)
outputDf = inputDf.copy()

# Loop through all rows
for index, row in inputDf.iterrows():
    campaign_name = row[RPT_COL_CAMPAIGN]
    # Extract Topic from campaign name
    topic = extract_topic(campaign_name)
    print("Campaign [%s] => Topic [%s]" % (campaign_name, topic))
    # Update Topic column
    outputDf.at[index, BULK_COL_TOPIC] = topic

# Drop any rows with missing Topic values. extract_topic returns '' when no
# topic is found, and dropna does not treat '' as missing, so filter those too.
outputDf = outputDf.dropna(subset=[BULK_COL_TOPIC])
outputDf = outputDf[outputDf[BULK_COL_TOPIC] != '']

if not outputDf.empty:
    print("outputDf", tableize(outputDf))
else:
    print("Empty outputDf")
Post generated on 2025-03-11 01:25:51 GMT