Script 685: Dimensions from Campaign Name

Purpose

Extract information from a campaign name and add it as columns to a DataFrame.

To Elaborate

The Python script solves the problem of extracting specific information from a campaign name and adding it as columns to a DataFrame. The campaign name is expected to follow a specific pattern, and the script uses regular expressions to extract the relevant information. The extracted information includes the start date, end date, goal, and target impressions/spend/views. The script then adds these extracted values as columns to the input DataFrame. The script also filters the DataFrame to include only rows with extracted information and creates an output DataFrame with the selected columns. The output DataFrame is then printed to check the extracted information.

Walking Through the Code

  1. The script defines column constants for the input DataFrame.
  2. The script defines a function extract_info_from_campaign_name_enhanced that takes a campaign name as input and uses regular expressions to extract the start date, end date, goal, and target impressions/spend/views from the campaign name.
  3. The script adds columns for the extracted information to the input DataFrame.
  4. The script iterates over each row in the input DataFrame and calls the extract_info_from_campaign_name_enhanced function to extract the information from the campaign name. If all the extracted information is available, it updates the corresponding columns in the input DataFrame.
  5. The script filters the input DataFrame for rows with extracted information and assigns the filtered DataFrame to filteredDf.
  6. The script defines a list of columns to be included in the output DataFrame.
  7. The script creates the output DataFrame by selecting the columns from the filteredDf DataFrame.
  8. The script prints the output DataFrame to check the extracted information.

Vitals

  • Script ID : 685
  • Client ID / Customer ID: 1306927167 / 60270139
  • Action Type: Bulk Upload (Preview)
  • Item Changed: Campaign
  • Output Columns: Account, Campaign, Goal, Pacing - End Date, Pacing - Start Date, Target (Impr/Spend/Views)
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: ascott@marinsoftware.com (ascott@marinsoftware.com)
  • Created by ascott@marinsoftware.com on 2024-02-08 20:19
  • Last Updated by ascott@marinsoftware.com on 2024-03-08 19:12
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
## 
## name: Dimension Tags from Campaign Name
## description:
##  
## 
## author: 
## created: 2023-12-04
## 

# Column Definitions
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_PACING_START_DATE = 'Pacing - Start Date'
RPT_COL_PACING_END_DATE = 'Pacing - End Date'
RPT_COL_GOAL = 'Goal'
RPT_COL_TARGET_IMPR_PER_SPENDVIEWS = 'Target (Impr/Spend/Views)'

# Function to extract information from the campaign name with enhanced logic
def extract_info_from_campaign_name_enhanced(campaign_name):
    pattern = r'(.*?)[-_]\s*([\d]{1,2}[\-/][\d]{1,2}[\-/][\d]{4})\s*[-_]\s*([\d]{1,2}[\-/][\d]{1,2}[\-/][\d]{4})\s*[-_]\s*(MS|MS |CPM|CPM |CPV|CPV )\s*[-_]\s*([\d,]+(?:\.\d+)?)'
    match = re.search(pattern, campaign_name)
    if match:
        start_date_str, end_date_str, goal, target_impr_per_spendviews = match.groups()[1:]

        def convert_date(date_str):
            try:
                return datetime.datetime.strptime(date_str, '%m/%d/%Y').date()
            except ValueError:
                return None

        start_date = convert_date(start_date_str)
        end_date = convert_date(end_date_str)

        return start_date, end_date, goal, target_impr_per_spendviews
    else:
        return None, None, None, None

# Adding columns for extracted information to inputDf
for column_name in ['Pacing - Start Date', 'Pacing - End Date', 'Goal', 'Target (Impr/Spend/Views)']:
    inputDf[column_name] = np.nan

# Process each row in inputDf to extract information
for index, row in inputDf.iterrows():
    start_date, end_date, goal, target_impr_per_spendviews = extract_info_from_campaign_name_enhanced(row[RPT_COL_CAMPAIGN])
    if start_date and end_date and goal and target_impr_per_spendviews:
        inputDf.at[index, 'Pacing - Start Date'] = start_date
        inputDf.at[index, 'Pacing - End Date'] = end_date
        inputDf.at[index, 'Goal'] = goal
        inputDf.at[index, 'Target (Impr/Spend/Views)'] = target_impr_per_spendviews

# Filter inputDf for rows with extracted information
filteredDf = inputDf.dropna(subset=['Pacing - Start Date', 'Pacing - End Date', 'Goal', 'Target (Impr/Spend/Views)'])

# Define the columns to be included in the output DataFrame
cols = [
    'Campaign',
    'Account',
    'Pacing - Start Date',
    'Pacing - End Date',
    'Goal',
    'Target (Impr/Spend/Views)'
]

# Create output DataFrame with the selected columns
outputDf = filteredDf[cols].copy()

# Print the output DataFrame to check the extracted information
print("Output DataFrame with extracted information:")
print(outputDf)


Post generated on 2024-05-15 07:44:05 GMT

comments powered by Disqus