Script 1247: Alert Active Ads with Expiring Dates

Purpose

The Python script identifies advertisements with expired or soon-to-expire application deadlines across various publishers.

To Elaborate

The script is designed to scan advertisements for any expired or nearly expired application deadlines. It processes data from multiple publishers to identify ads that are either past their application deadline or are approaching it within a specified number of days. This is crucial for managing ad campaigns effectively, ensuring that outdated ads are flagged and can be updated or removed promptly. The script operates by checking the ‘Application Deadline’ field in the ad data and comparing it against the current date to determine if an alert should be raised. This helps in maintaining the relevance and accuracy of ad content across platforms.

Walking Through the Code

  1. Initialization and Configuration
    • The script begins by setting up a local mode configuration, which includes paths for downloading and loading data files. It checks whether the code is running on a server or locally, and loads the necessary data accordingly.
    • User-configurable parameters include WITHIN_DAYS_TO_DEADLINE, which determines how many days before the application deadline an alert should be triggered.
  2. Data Processing and Alerts
    • The script ensures that the ‘Application Deadline’ column in the data is in the correct datetime format.
    • It calculates a deadline check date by adding the user-defined number of days to the current date.
    • The script then identifies ads with deadlines that are either within the specified alert period or have already passed, and prepares an output DataFrame containing these ads.
    • If running in local development mode, the script writes the output and debug data to CSV files for further analysis.

Vitals

  • Script ID : 1247
  • Client ID / Customer ID: 1306926629 / 60270083
  • Action Type: Email Report
  • Item Changed: None
  • Output Columns:
  • Linked Datasource: M1 Report
  • Reference Datasource: None
  • Owner: Michael Huang (mhuang@marinsoftware.com)
  • Created by Michael Huang on 2024-07-01 02:58
  • Last Updated by Michael Huang on 2024-07-18 00:55
> See it in Action

Python Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
##
## name: Alert on Ads with expiring dates
## description:
##  * Alerts on expired or nearly expired Ads via the "Application Deadline" dimension
##.
## author: Michael S. Huang
## created: 2024-07-01
## 


########### START - Local Mode Config ###########
# Step 1: Uncomment download_preview_input flag and run Preview successfully with the Datasources you want
download_preview_input=False
# Step 2: In MarinOne, go to Scripts -> Preview -> Logs, download 'dataSourceDict' pickle file, and update pickle_path below
# pickle_path = ''
pickle_path = '/Users/mhuang/Downloads/pickle/all_campus_expiring_ads_alert_20240711.pkl'
# Step 3: Copy this script into local IDE with Python virtual env loaded with pandas and numpy.
# Step 4: Run locally with below code to init dataSourceDict

# determine if code is running on server or locally
def is_executing_on_server():
    try:
        # Attempt to access a known restricted builtin
        dict_items = dataSourceDict.items()
        return True
    except NameError:
        # NameError: dataSourceDict object is missing (indicating not on server)
        return False

local_dev = False

if is_executing_on_server():
    print("Code is executing on server. Skip init.")
elif len(pickle_path) > 3:
    print("Code is NOT executing on server. Doing init.")
    local_dev = True
    # load dataSourceDict via pickled file
    import pickle
    dataSourceDict = pickle.load(open(pickle_path, 'rb'))

    # print shape and first 5 rows for each entry in dataSourceDict
    for key, value in dataSourceDict.items():
        print(f"Shape of dataSourceDict[{key}]: {value.shape}")
        # print(f"First 5 rows of dataSourceDict[{key}]:\n{value.head(5)}")

    # set outputDf same as inputDf
    inputDf = dataSourceDict["1"]
    outputDf = inputDf.copy()

    # setup timezone
    import datetime
    # Chicago Timezone is GMT-5. Adjust as needed.
    CLIENT_TIMEZONE = datetime.timezone(datetime.timedelta(hours=-5))

    # import pandas
    import pandas as pd
    import numpy as np

    # other imports
    import re
    import urllib

    # import Marin util functions
    # from marin_scripts_utils import tableize, select_changed

    # pandas settings
    pd.set_option('display.max_columns', None)  # Display all columns
    pd.set_option('display.max_colwidth', None)  # Display full content of each column

else:
    print("Running locally but no pickle path defined. dataSourceDict not loaded.")
    exit(1)
########### END - Local Mode Setup ###########

today = datetime.datetime.now(CLIENT_TIMEZONE).date()

# primary data source and columns
inputDf = dataSourceDict["1"]
RPT_COL_CREATIVE_TYPE = 'Creative Type'
RPT_COL_PUBLISHER = 'Publisher'
RPT_COL_ACCOUNT = 'Account'
RPT_COL_CAMPAIGN = 'Campaign'
RPT_COL_GROUP = 'Group'
RPT_COL_CREATIVE_ID = 'Creative ID'
RPT_COL_STATUS = 'Status'
RPT_COL_IMPR = 'Impr.'
RPT_COL_APPLICATION_DEADLINE = 'Application Deadline'
RPT_COL_DESCRIPTION_LINE_1 = 'Description Line 1'


########### User Configurable Param ##########
# alert days before application deadline
WITHIN_DAYS_TO_DEADLINE = 7
##############################################


#### User Code Starts Here

print("inputDf.shape", inputDf.shape)


# Ensure the 'Application Deadline' column is in datetime format
inputDf[RPT_COL_APPLICATION_DEADLINE] = pd.to_datetime(inputDf[RPT_COL_APPLICATION_DEADLINE], errors='coerce')

# Alert: 'Application Deadline' is within DAYS_BEFORE_EXPIRY days after today
deadline_check_date = today + datetime.timedelta(days=WITHIN_DAYS_TO_DEADLINE)
deadline_check_date_pd = pd.to_datetime(deadline_check_date)
today_pd = pd.to_datetime(today)
deadline_alert =  (inputDf[RPT_COL_APPLICATION_DEADLINE] > today_pd) & (inputDf[RPT_COL_APPLICATION_DEADLINE] <= deadline_check_date_pd)

# Alert: 'Application Deadline' has already passed
passed_deadline = (inputDf[RPT_COL_APPLICATION_DEADLINE] <= today_pd)

# Prepare output
outputDf = inputDf[deadline_alert | passed_deadline]

print("outputDf.shape", outputDf.shape)

debugDf = inputDf

### local debug

if local_dev:
    output_filename = 'outputDf.csv'
    outputDf.to_csv(output_filename, index=False)
    print(f"Local Dev: Output written to: {output_filename}")

    debug_filename = 'debugDf.csv'
    debugDf.to_csv(debug_filename, index=False)
    print(f"Local Dev: Debug written to: {debug_filename}")



Post generated on 2024-11-27 06:58:46 GMT

comments powered by Disqus