Script 805: Pacing Dimension Value from Campaign Name
Purpose
The Python script extracts and organizes specific information from campaign names into a structured format for further analysis.
To Elaborate
The script parses each campaign name for four details: a pacing start date, a pacing end date, a goal, and a target value. It expects them to appear in that order, separated by hyphens or underscores, after an arbitrary prefix. Dates are written as month, day, and four-digit year; the goal is one of MS, CPM, or CPV; and the target is a number that may contain thousands separators and a decimal part. A regular expression identifies these parts, and the extracted values populate new columns in a DataFrame, which is then filtered to the rows where all four values were found. This allows key data points embedded in campaign names to be captured automatically for tracking and analyzing campaign performance.
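To make the expected name format concrete, here is a minimal sketch of the matching step on its own. The pattern has the same structure as the one in the script, with the goal alternation simplified; the helper name `parse_name` and the sample campaign names are invented for illustration.

```python
import re

# Same structure as the script's pattern: prefix, start date, end date,
# goal (MS, CPM, or CPV), and target, separated by '-' or '_'.
PATTERN = re.compile(
    r'(.*?)[-_]\s*(\d{1,2}[-/]\d{1,2}[-/]\d{4})\s*[-_]\s*'
    r'(\d{1,2}[-/]\d{1,2}[-/]\d{4})\s*[-_]\s*(MS|CPM|CPV)\s*[-_]\s*'
    r'([\d,]+(?:\.\d+)?)'
)

def parse_name(campaign_name):
    """Return (start, end, goal, target) as strings, or None if no match."""
    match = PATTERN.search(campaign_name)
    # The first group is the free-form prefix, which is discarded.
    return match.groups()[1:] if match else None

# Hypothetical campaign names, not taken from any real account.
print(parse_name('Spring Sale_03/01/2024_03/31/2024_CPM_1,500,000'))
# ('03/01/2024', '03/31/2024', 'CPM', '1,500,000')
print(parse_name('Brand Awareness - Always On'))
# None
```

Note that at this stage every captured value, including the target, is still a string.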
Walking Through the Code
- Function Definition
  - The script defines a function `extract_info_from_campaign_name_enhanced` that uses a regular expression to parse campaign names and extract start and end dates, goals, and target metrics.
  - The function attempts to convert the extracted date strings into `date` objects, returning `None` if conversion fails.
- DataFrame Preparation
  - New columns for ‘Pacing - Start Date’, ‘Pacing - End Date’, ‘Goal’, and ‘Target (Impr/Spend/Views)’ are added to the input DataFrame (`inputDf`), initialized with `NaN` values.
- Data Extraction
  - The script iterates over each row in `inputDf`, applying the extraction function to the ‘Campaign’ column.
  - If all required information is successfully extracted, the corresponding DataFrame row is updated with this data.
- Data Filtering and Output
  - Rows with complete extracted information are retained in a new DataFrame (`filteredDf`).
  - A subset of columns is selected to create the final output DataFrame (`outputDf`), which is then printed for verification.
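The four steps above can be tried on a small stand-in DataFrame. The sketch below is not the script's own approach: it swaps the row-by-row loop for pandas' `Series.str.extract`, and for brevity it only accepts `/` as the date separator. The contents of `inputDf` and the group names `start`, `end`, `goal`, and `target` are assumptions made for illustration.

```python
import pandas as pd

# Stand-in for the input DataFrame; account and campaign names are invented.
inputDf = pd.DataFrame({
    'Account': ['Acme', 'Acme', 'Acme'],
    'Campaign': [
        'Spring Sale_03/01/2024_03/31/2024_CPM_1,500,000',
        'Video Push - 04/01/2024 - 04/30/2024 - CPV - 25000',
        'Brand Awareness - Always On',
    ],
})

# Named groups become column names in the frame returned by str.extract.
pattern = (
    r'[-_]\s*(?P<start>\d{1,2}/\d{1,2}/\d{4})\s*[-_]\s*'
    r'(?P<end>\d{1,2}/\d{1,2}/\d{4})\s*[-_]\s*'
    r'(?P<goal>MS|CPM|CPV)\s*[-_]\s*(?P<target>[\d,]+(?:\.\d+)?)'
)
parts = inputDf['Campaign'].str.extract(pattern)

outputDf = inputDf[['Campaign', 'Account']].copy()
# errors='coerce' turns unparseable or missing dates into NaT.
outputDf['Pacing - Start Date'] = pd.to_datetime(
    parts['start'], format='%m/%d/%Y', errors='coerce')
outputDf['Pacing - End Date'] = pd.to_datetime(
    parts['end'], format='%m/%d/%Y', errors='coerce')
outputDf['Goal'] = parts['goal']
outputDf['Target (Impr/Spend/Views)'] = parts['target']

# Keep only rows where every value was extracted.
outputDf = outputDf.dropna()
print(outputDf)
```

With this sample data, the third campaign has no dates in its name and is dropped, leaving two rows. Unlike the script, this variant stores the dates as pandas timestamps rather than `date` objects.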
Vitals
- Script ID: 805
- Client ID / Customer ID: 1306927731 / 60270139
- Action Type: Bulk Upload
- Item Changed: Campaign
- Output Columns: Account, Campaign, Goal, Pacing - End Date, Pacing - Start Date, Target (Impr/Spend/Views)
- Linked Datasource: M1 Report
- Reference Datasource: None
- Owner: ascott@marinsoftware.com (ascott@marinsoftware.com)
- Created by ascott@marinsoftware.com on 2024-03-13 17:57
- Last Updated by ascott@marinsoftware.com on 2024-03-13 18:28
Python Code
## name: Dimension Tags from Campaign Name
## description:
##
##
## author:
## created: 2023-12-04
##
# Imports added so the listing is self-contained. inputDf is not defined here
# and is expected to be supplied by the environment that runs the script.
import datetime
import re

import numpy as np
import pandas as pd

# Column Definitions
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_PACING_START_DATE = 'Pacing - Start Date'
RPT_COL_PACING_END_DATE = 'Pacing - End Date'
RPT_COL_GOAL = 'Goal'
RPT_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'

# Columns filled in from the campaign name
EXTRACTED_COLS = [
    RPT_COL_PACING_START_DATE,
    RPT_COL_PACING_END_DATE,
    RPT_COL_GOAL,
    RPT_COL_TARGET_IMPR_PER_SPENDVIEWS,
]

# Function to extract information from the campaign name with enhanced logic
def extract_info_from_campaign_name_enhanced(campaign_name):
    # Groups: prefix, start date, end date, goal (MS/CPM/CPV), target number
    pattern = r'(.*?)[-_]\s*(\d{1,2}[-/]\d{1,2}[-/]\d{4})\s*[-_]\s*(\d{1,2}[-/]\d{1,2}[-/]\d{4})\s*[-_]\s*(MS|CPM|CPV)\s*[-_]\s*([\d,]+(?:\.\d+)?)'

    def convert_date(date_str):
        try:
            # The pattern accepts '-' or '/' inside dates, so normalize to '/'
            # before parsing; otherwise hyphenated dates would never convert.
            normalized = date_str.replace('-', '/')
            return datetime.datetime.strptime(normalized, '%m/%d/%Y').date()
        except ValueError:
            return None

    match = re.search(pattern, campaign_name)
    if match:
        # Skip the first group (the free-form prefix)
        start_date_str, end_date_str, goal, target_impr_per_spendviews = match.groups()[1:]
        start_date = convert_date(start_date_str)
        end_date = convert_date(end_date_str)
        return start_date, end_date, goal, target_impr_per_spendviews
    else:
        return None, None, None, None

# Adding columns for extracted information to inputDf.
# Object dtype lets the NaN-initialized columns hold dates and strings.
for column_name in EXTRACTED_COLS:
    inputDf[column_name] = pd.Series(np.nan, index=inputDf.index, dtype=object)

# Process each row in inputDf to extract information
for index, row in inputDf.iterrows():
    start_date, end_date, goal, target_impr_per_spendviews = extract_info_from_campaign_name_enhanced(row[RPT_COL_CAMPAIGN])
    # Update the row only when all four values were extracted
    if start_date and end_date and goal and target_impr_per_spendviews:
        inputDf.at[index, RPT_COL_PACING_START_DATE] = start_date
        inputDf.at[index, RPT_COL_PACING_END_DATE] = end_date
        inputDf.at[index, RPT_COL_GOAL] = goal
        inputDf.at[index, RPT_COL_TARGET_IMPR_PER_SPENDVIEWS] = target_impr_per_spendviews

# Filter inputDf for rows with extracted information
filteredDf = inputDf.dropna(subset=EXTRACTED_COLS)

# Define the columns to be included in the output DataFrame
cols = [RPT_COL_CAMPAIGN, RPT_COL_ACCOUNT] + EXTRACTED_COLS

# Create output DataFrame with the selected columns
outputDf = filteredDf[cols].copy()

# Print the output DataFrame to check the extracted information
print("Output DataFrame with extracted information:")
print(outputDf)
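One consequence of the listing above is easy to miss: the regular expression only checks that a date has the right shape, while `strptime` checks that it is a real calendar date. A campaign name can therefore match the pattern and still be dropped, because the update step requires both dates to convert. The sketch below mirrors the script's `convert_date` helper for `/`-separated dates; the sample strings are invented for illustration.

```python
import datetime

def convert_date(date_str):
    """Parse an MM/DD/YYYY string into a date; return None on failure."""
    try:
        return datetime.datetime.strptime(date_str, '%m/%d/%Y').date()
    except ValueError:
        return None

# Each string has the digit shape the pattern accepts,
# but only the first is a valid calendar date.
for candidate in ['3/1/2024', '02/30/2024', '13/01/2024']:
    print(candidate, '->', convert_date(candidate))
# 3/1/2024 -> 2024-03-01
# 02/30/2024 -> None
# 13/01/2024 -> None
```

The third case suggests that a name written day-first (for example 13/01/2024 meaning 13 January) would be silently skipped rather than reported.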
Post generated on 2024-11-27 06:58:46 GMT