Script 1247: Alert Active Ads with Expiring Dates

Purpose:

The script identifies ads with expired or nearly expired application deadlines across all publishers.

To Elaborate

The Python script is designed to scan advertisements for expired or nearly expired application deadlines. It processes data from various publishers to identify ads that are either past their application deadline or are approaching it within a specified number of days. This is crucial for businesses to ensure that their ads are timely and relevant, preventing the display of outdated information. The script allows users to configure the number of days before the deadline to trigger an alert, providing flexibility in managing ad campaigns effectively.

Walking Through the Code

  1. Local Mode Configuration:
    • The script begins with a configuration section for local development, including setting a path for a pickle file containing data and determining if the script is running on a server or locally.
    • If running locally, it loads data from the specified pickle file into a dictionary called dataSourceDict.
  2. Data Preparation:
    • The script imports necessary libraries such as pandas, numpy, and datetime.
    • It sets up pandas display options to show all columns and full content for better visibility during debugging.
  3. User Configurable Parameter:
    • The parameter WITHIN_DAYS_TO_DEADLINE is defined, allowing users to specify how many days before the application deadline an alert should be triggered.
  4. Data Processing:
    • The script ensures that the ‘Application Deadline’ column is in datetime format.
    • It calculates a deadline_check_date by adding the user-defined number of days to the current date.
    • It identifies ads with deadlines within the specified range or those that have already passed.
  5. Output Preparation:
    • The script filters the input data to create an output DataFrame containing only the ads that meet the alert criteria.
    • If running in local development mode, it writes the output and debug data to CSV files for further analysis.

Vitals

  • Script ID : 1247
  • Client ID / Customer ID: 1306926629 / 60270083
  • Action Type: Email Report
  • Item Changed: None
  • Output Columns:
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Michael Huang (mhuang@marinsoftware.com)
  • Created by Michael Huang on 2024-07-01 02:58
  • Last Updated by Michael Huang on 2024-07-18 00:55
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
##
## name: Alert on Ads with expiring dates
## description:
##  * Alerts on expired or nearly expired Ads via the "Application Deadline" dimension
##.
## author: Michael S. Huang
## created: 2024-07-01
## 


########### START - Local Mode Config ###########
# Step 1: Uncomment download_preview_input flag and run Preview successfully with the Datasources you want
download_preview_input=False
# Step 2: In MarinOne, go to Scripts -> Preview -> Logs, download 'dataSourceDict' pickle file, and update pickle_path below
# pickle_path = ''
pickle_path = '/Users/mhuang/Downloads/pickle/all_campus_expiring_ads_alert_20240711.pkl'
# Step 3: Copy this script into local IDE with Python virtual env loaded with pandas and numpy.
# Step 4: Run locally with below code to init dataSourceDict

# determine if code is running on server or locally
def is_executing_on_server():
    try:
        # Attempt to access a known restricted builtin
        dict_items = dataSourceDict.items()
        return True
    except NameError:
        # NameError: dataSourceDict object is missing (indicating not on server)
        return False

local_dev = False

if is_executing_on_server():
    print("Code is executing on server. Skip init.")
elif len(pickle_path) > 3:
    print("Code is NOT executing on server. Doing init.")
    local_dev = True
    # load dataSourceDict via pickled file
    import pickle
    dataSourceDict = pickle.load(open(pickle_path, 'rb'))

    # print shape and first 5 rows for each entry in dataSourceDict
    for key, value in dataSourceDict.items():
        print(f"Shape of dataSourceDict[{key}]: {value.shape}")
        # print(f"First 5 rows of dataSourceDict[{key}]:\n{value.head(5)}")

    # set outputDf same as inputDf
    inputDf = dataSourceDict["1"]
    outputDf = inputDf.copy()

    # setup timezone
    import datetime
    # Chicago Timezone is GMT-5. Adjust as needed.
    CLIENT_TIMEZONE = datetime.timezone(datetime.timedelta(hours=-5))

    # import pandas
    import pandas as pd
    import numpy as np

    # other imports
    import re
    import urllib

    # import Marin util functions
    # from marin_scripts_utils import tableize, select_changed

    # pandas settings
    pd.set_option('display.max_columns', None)  # Display all columns
    pd.set_option('display.max_colwidth', None)  # Display full content of each column

else:
    print("Running locally but no pickle path defined. dataSourceDict not loaded.")
    exit(1)
########### END - Local Mode Setup ###########

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CREATIVE_TYPE = 'Creative Type'
RPT_COL_PUBLISHER = 'Publisher'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP = 'Group'
RPT_COL_CREATIVE_ID = 'Creative ID'
RPT_COL_STATUS = 'Status'
RPT_COL_IMPR = 'Impr.'
RPT_COL_APPLICATION_DEADLINE = 'Application Deadline'
RPT_COL_DESCRIPTION_LINE_1 = 'Description Line 1'


########### User Configurable Param ##########
# alert days before application deadline
WITHIN_DAYS_TO_DEADLINE = 7
##############################################


#### User Code Starts Here

print("inputDf.shape", inputDf.shape)


# Ensure the 'Application Deadline' column is in datetime format
inputDf[RPT_COL_APPLICATION_DEADLINE] = pd.to_datetime(inputDf[RPT_COL_APPLICATION_DEADLINE], errors='coerce')

# Alert: 'Application Deadline' is within DAYS_BEFORE_EXPIRY days after today
deadline_check_date = today + datetime.timedelta(days=WITHIN_DAYS_TO_DEADLINE)
deadline_check_date_pd = pd.to_datetime(deadline_check_date)
today_pd = pd.to_datetime(today)
deadline_alert =  (inputDf[RPT_COL_APPLICATION_DEADLINE] > today_pd) & (inputDf[RPT_COL_APPLICATION_DEADLINE] <= deadline_check_date_pd)

# Alert: 'Application Deadline' has already passed
passed_deadline = (inputDf[RPT_COL_APPLICATION_DEADLINE] <= today_pd)

# Prepare output
outputDf = inputDf[deadline_alert | passed_deadline]

print("outputDf.shape", outputDf.shape)

debugDf = inputDf

### local debug

if local_dev:
    output_filename = 'outputDf.csv'
    outputDf.to_csv(output_filename, index=False)
    print(f"Local Dev: Output written to: {output_filename}")

    debug_filename = 'debugDf.csv'
    debugDf.to_csv(debug_filename, index=False)
    print(f"Local Dev: Debug written to: {debug_filename}")



Post generated on 2025-03-11 01:25:51 GMT

comments powered by Disqus