Script 805: Pacing Dimension Value from Campaign Name

Purpose:

The Python script extracts specific information from campaign names and organizes it into a structured format for further analysis.

To Elaborate

The script is designed to parse campaign names to extract key details such as start and end dates, goals, and target metrics. These details are then used to populate a structured DataFrame, which facilitates easier analysis and reporting. The script assumes that the campaign names follow a specific pattern, which includes dates and metrics separated by delimiters. By extracting this information, the script helps in organizing and structuring data for better insights into campaign performance and planning.

Walking Through the Code

  1. Function Definition:
    • The script defines a function extract_info_from_campaign_name_enhanced that uses a regular expression to parse campaign names. It extracts the start date, end date, goal, and target metrics if they match the expected pattern.
    • The function includes a nested convert_date function to convert date strings into datetime objects, handling potential errors gracefully.
  2. DataFrame Preparation:
    • New columns are added to the inputDf DataFrame to store the extracted information: ‘Pacing - Start Date’, ‘Pacing - End Date’, ‘Goal’, and ‘Target (Impr/Spend/Views)’.
  3. Data Extraction:
    • The script iterates over each row in inputDf, applying the extraction function to the ‘Campaign’ column.
    • If the extraction is successful, the corresponding row in inputDf is updated with the extracted values.
  4. Filtering and Output:
    • Rows with successfully extracted information are filtered into a new DataFrame, filteredDf.
    • A subset of columns is selected to create the outputDf, which is then printed to verify the extracted information.

Vitals

  • Script ID : 805
  • Client ID / Customer ID: 1306927731 / 60270139
  • Action Type: Bulk Upload
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Goal, Pacing - End Date, Pacing - Start Date, Target (Impr/Spend/Views)
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: ascott@marinsoftware.com (ascott@marinsoftware.com)
  • Created by ascott@marinsoftware.com on 2024-03-13 17:57
  • Last Updated by ascott@marinsoftware.com on 2024-03-13 18:28
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
## name: Dimension Tags from Campaign Name
## description:
##  
## 
## author: 
## created: 2023-12-04
## 

# Column Definitions
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_PACING_START_DATE = 'Pacing - Start Date'
RPT_COL_PACING_END_DATE = 'Pacing - End Date'
RPT_COL_GOAL = 'Goal'
RPT_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'

# Function to extract information from the campaign name with enhanced logic
def extract_info_from_campaign_name_enhanced(campaign_name):
    pattern = r'(.*?)[-_]\s*([\d]{1,2}[\-/][\d]{1,2}[\-/][\d]{4})\s*[-_]\s*([\d]{1,2}[\-/][\d]{1,2}[\-/][\d]{4})\s*[-_]\s*(MS|MS |CPM|CPM |CPV|CPV )\s*[-_]\s*([\d,]+(?:\.\d+)?)'
    match = re.search(pattern, campaign_name)
    if match:
        start_date_str, end_date_str, goal, target_impr_per_spendviews = match.groups()[1:]

        def convert_date(date_str):
            try:
                return datetime.datetime.strptime(date_str, '%m/%d/%Y').date()
            except ValueError:
                return None

        start_date = convert_date(start_date_str)
        end_date = convert_date(end_date_str)

        return start_date, end_date, goal, target_impr_per_spendviews
    else:
        return None, None, None, None

# Adding columns for extracted information to inputDf
for column_name in ['Pacing - Start Date', 'Pacing - End Date', 'Goal', 'Target (Impr/Spend/Views)']:
    inputDf[column_name] = np.nan

# Process each row in inputDf to extract information
for index, row in inputDf.iterrows():
    start_date, end_date, goal, target_impr_per_spendviews = extract_info_from_campaign_name_enhanced(row[RPT_COL_CAMPAIGN])
    if start_date and end_date and goal and target_impr_per_spendviews:
        inputDf.at[index, 'Pacing - Start Date'] = start_date
        inputDf.at[index, 'Pacing - End Date'] = end_date
        inputDf.at[index, 'Goal'] = goal
        inputDf.at[index, 'Target (Impr/Spend/Views)'] = target_impr_per_spendviews

# Filter inputDf for rows with extracted information
filteredDf = inputDf.dropna(subset=['Pacing - Start Date', 'Pacing - End Date', 'Goal', 'Target (Impr/Spend/Views)'])

# Define the columns to be included in the output DataFrame
cols = [
    'Campaign',
    'Account',
    'Pacing - Start Date',
    'Pacing - End Date',
    'Goal',
    'Target (Impr/Spend/Views)'
]

# Create output DataFrame with the selected columns
outputDf = filteredDf[cols].copy()

# Print the output DataFrame to check the extracted information
print("Output DataFrame with extracted information:")
print(outputDf)


Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus