Script 1325: Scripts Campaign Tagging
Purpose:
The Python script parses campaign names to extract and tag seminar details such as format, location, registration target, and seminar code.
To Elaborate
The script automates the extraction of seminar details from campaign names and tags each campaign with the corresponding dimensions. It processes a dataset containing campaign names and extracts four pieces of information: the format of the seminar, the location, the registration target, and a unique seminar code. These details are added as new columns to the dataset, facilitating structured budget allocation (SBA) and analysis. The script assumes a specific structure in the campaign names: the format is the first word, the location is the run of words between the format and the first month name, the registration target is a number followed by 'hh', and the seminar code follows the pattern 'DN-' plus digits. Campaign names that do not follow this structure yield empty or incorrect values for the affected fields, so consistent naming is a prerequisite for reliable tagging.
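As an illustration of the assumed naming structure, the sketch below walks one campaign name through the four extraction rules. The campaign name is hypothetical and chosen to fit the convention; real names in the account may differ.

```python
import re

# Hypothetical campaign name that follows the assumed convention.
campaign_name = "Dinner Palm Springs August 14 50hh DN-1234"

MONTHS = {"January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"}

parts = campaign_name.split()

# Rule 1: the first word is the format.
seminar_format = parts[0] if parts else None

# Rule 2: the location is every word after the format, up to the first month name.
location_parts = []
for part in parts[1:]:
    if part in MONTHS:
        break
    location_parts.append(part)
location = " ".join(location_parts).rstrip(",") or None

# Rule 3: the registration target is the number in front of 'hh'.
reg_match = re.search(r"(\d+)\w*hh", campaign_name)
reg_target = reg_match.group(1) if reg_match else None

# Rule 4: the seminar code is 'DN-' followed by digits.
code_match = re.search(r"(DN-\d+)", campaign_name)
seminar_code = code_match.group(1) if code_match else None

print(seminar_format, location, reg_target, seminar_code, sep=" | ")
# Dinner | Palm Springs | 50 | DN-1234
```

Note that the registration target comes back as a string ("50"), not an integer, because it is taken directly from the regular expression match.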
Walking Through the Code
- Configurable Parameters:
  - The script begins by defining configurable parameters, including the primary data source (inputDf) and the names of the output columns for the extracted details.
- Function Definition:
  - A function, extract_seminar_details, is defined to parse the campaign name. It initializes default values (None) for format, location, registration target, and seminar code.
  - The campaign name is split into parts on whitespace, and the first part is assumed to be the format.
  - The location is built from the parts following the format, stopping at the first part that exactly matches a full English month name (for example, 'August').
  - The registration target is identified using a regular expression that searches for a number followed by 'hh'.
  - The seminar code is extracted using a regular expression that matches the pattern 'DN-' followed by digits.
- Data Processing:
  - The script creates a copy of the input DataFrame (outputDf) to preserve the original data.
  - It applies the extract_seminar_details function to the 'Campaign' column, populating the new columns with the extracted details.
- Output:
- The script checks if the output DataFrame is not empty and then displays it with the new columns added. If the DataFrame is empty, it prints a message indicating so.
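The data processing and output steps above rely on a common pandas pattern: a row-wise function that returns a pd.Series, whose values are then assigned to several new columns at once. The sketch below shows that pattern in isolation. The function extract_codes, the sample rows, and the variable names are hypothetical stand-ins and cover only two of the four fields.

```python
import re
import pandas as pd

def extract_codes(campaign_name):
    """Simplified stand-in for extract_seminar_details (two fields only)."""
    reg = re.search(r"(\d+)\w*hh", campaign_name)
    code = re.search(r"(DN-\d+)", campaign_name)
    # One Series per row; apply() stacks these into a two-column DataFrame.
    return pd.Series([reg.group(1) if reg else None,
                      code.group(1) if code else None])

# Hypothetical input: one name that fits the convention, one that does not.
input_df = pd.DataFrame({"Campaign": [
    "Dinner Palm Springs August 14 50hh DN-1234",
    "Brand Search Generic",
]})

# Work on a copy so the input DataFrame is left untouched.
output_df = input_df.copy()
output_df[["Reg Target", "Seminar Code"]] = output_df["Campaign"].apply(extract_codes)

if not output_df.empty:
    print(output_df.to_string(index=False))
else:
    print("Empty output_df")
```

Rows whose campaign name does not match a pattern keep the default None in the corresponding column, which makes non-conforming campaigns easy to filter for review.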
Vitals
- Script ID: 1325
- Client ID / Customer ID: 1306928223 / 60270455
- Action Type: Bulk Upload (Preview)
- Item Changed: Campaign
- Output Columns: Account, Campaign, Location, Reg Target, Seminar Code, Format
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: Grégory Pantaine (gpantaine@marinsoftware.com)
- Created by Grégory Pantaine on 2024-08-14 17:51
- Last Updated by emerryfield@marinsoftware.com on 2024-08-15 18:37
Python Code
##
## Name: Campaign Name - Dimension Auto Tagging: Seminar Details
## Description:
##  Parse Campaign Name and add Campaign-level Marin Dimensions Tags for Seminar Details.
##  Extracts Format, Location, Reg Target, and Seminar Code from the Campaign Name.
##
## author: Gregory Pantaine
## created: 2024-08-12
##

# Explicit imports; harmless if the script runtime already provides re and pd.
# dataSourceDict and tableize are not defined here and are expected to be
# supplied by the script runtime.
import re
import pandas as pd

########### Configurable Params - START ##########

# Primary data source and columns
inputDf = dataSourceDict["1"]  # Assuming 'Campaign' column exists in this DataFrame

# Output columns
OUTPUT_COL_FORMAT = 'Format'
OUTPUT_COL_LOCATION = 'Location'
OUTPUT_COL_REG_TARGET = 'Reg Target'
OUTPUT_COL_SEMINAR_CODE = 'Seminar Code'
CAMPAIGN_COL = 'Campaign'

########### Configurable Params - END ##########

# Full English month names; the first exact match ends the location
MONTHS = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']


# Function to extract seminar details from the campaign name
def extract_seminar_details(campaign_name):
    # Initialize default values (seminar_format avoids shadowing the built-in format)
    seminar_format = None
    location = None
    reg_target = None
    seminar_code = None

    # Split the campaign name into parts based on whitespace
    parts = campaign_name.split()

    # Extract Format (assumed to be the first part)
    if len(parts) >= 1:
        seminar_format = parts[0]

    # Extract Location (from the second part up to the first part that is a month name)
    location_parts = []
    for part in parts[1:]:
        if part in MONTHS:
            break
        location_parts.append(part)
    if location_parts:
        location = ' '.join(location_parts).rstrip(',')

    # Extract Reg Target (number preceding 'hh'); kept as a string
    reg_target_match = re.search(r'(\d+)\w*hh', campaign_name)
    if reg_target_match:
        reg_target = reg_target_match.group(1)

    # Extract Seminar Code (pattern: DN- followed by digits)
    seminar_code_match = re.search(r'(DN-\d+)', campaign_name)
    if seminar_code_match:
        seminar_code = seminar_code_match.group(1)

    return pd.Series([seminar_format, location, reg_target, seminar_code])


# Copy all input rows to output
outputDf = inputDf.copy()

# Apply the extraction function to the 'Campaign' column
outputDf[[OUTPUT_COL_FORMAT, OUTPUT_COL_LOCATION, OUTPUT_COL_REG_TARGET, OUTPUT_COL_SEMINAR_CODE]] = outputDf[CAMPAIGN_COL].apply(extract_seminar_details)

# Display the output DataFrame with the new columns
if not outputDf.empty:
    print("outputDf", tableize(outputDf[[CAMPAIGN_COL, OUTPUT_COL_FORMAT, OUTPUT_COL_LOCATION, OUTPUT_COL_REG_TARGET, OUTPUT_COL_SEMINAR_CODE]]))
else:
    print("Empty outputDf")
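One consequence of the location rule is worth seeing in isolation: the month check is an exact string comparison, so a month token carrying punctuation (such as 'August,') is not recognized and the rest of the name is absorbed into the location. The sketch below contrasts that behavior with a punctuation-tolerant variant. The helper split_location, its tolerant flag, and the sample name are hypothetical and are not part of the script above.

```python
import string

MONTHS = {"January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"}

def split_location(parts, tolerant=False):
    """Join the tokens after the first one until a month token is reached."""
    location_parts = []
    for part in parts[1:]:
        # Tolerant mode compares the token with surrounding punctuation removed.
        token = part.strip(string.punctuation) if tolerant else part
        if token in MONTHS:
            break
        location_parts.append(part)
    return " ".join(location_parts).rstrip(",") or None

# Hypothetical campaign name with a comma attached to the month.
parts = "Lunch San Diego August, 25hh DN-77".split()

strict = split_location(parts)                    # exact match, as in the script
tolerant = split_location(parts, tolerant=True)   # hypothetical variant

print(strict)    # San Diego August, 25hh DN-77
print(tolerant)  # San Diego
```

Whether the tolerant comparison is appropriate depends on how campaign names are actually written in the account. If months are always bare words, the exact match is sufficient.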
Post generated on 2025-03-11 01:25:51 GMT