Data Scientist, Traffic Quality
ADCI - Karnataka•4h ago
IN, KA, Bengaluru • Onsite • Full-time • Mid Level • 2+ yrs exp
Top focus
Data Scientist, Vp Data
Amazon Ads is a multi-billion dollar global business that delivers advertising experiences across Amazon's owned-and-operated properties (including Prime Video, Twitch, Fire TV, and Amazon.com), third-party publisher networks, and emerging channels like generative AI-powered shopping experiences. As one of the fastest-growing segments of Amazon, we operate at unprecedented scale across desktop, mobile, connected TV, and emerging surfaces.

Within Amazon Ads, Traffic Quality is a critical pillar of advertiser trust and marketplace integrity. Our mission is to build advanced capabilities that work at petabyte scale to detect sophisticated invalid traffic (IVT), which includes sophisticated non-human traffic, bot networks, and fraudulent engagement patterns across programmatic advertising. We are on a journey to establish Amazon Ads as an industry leader in traffic quality standards and transparency.

Our research agenda focuses on staying ahead of adversarial actors through continuous innovation in detection methodologies, leveraging state-of-the-art techniques in deep learning and generative modeling, user behavior and multi-modal representation learning, anomaly detection, time-series analysis, and sparse labeling methods. We process billions of ad events daily, developing novel algorithms that balance precision and recall while operating under strict latency constraints. Our work directly protects hundreds of millions of dollars in advertiser spend annually while maintaining a seamless user experience.

Key job responsibilities
As a Data Scientist II in Traffic Quality, you will solve inherently hard problems in advertising fraud detection by applying advanced statistical techniques and machine learning. You'll work on systems that process billions of ad impressions and clicks per day, using Amazon's cloud services including EC2, S3, EMR, SageMaker, and Redshift.
- Define and frame new research problems in fraud detection where neither problem nor solution is well-defined.
- Apply new machine learning approaches, models, and algorithms to detect sophisticated invalid traffic.
- Apply domain knowledge to perform broad data analysis as a precursor to modeling and build business insights.
- Work with unstructured and massive datasets to deliver results.
- Produce research reports meeting top-tier external publication standards.
- Mentor and develop junior scientists on the team.

About the team
Here are a few papers published by the team:
1/ [Scaling Generative Pre-training for User Ad Activity Sequences. AdKDD 2023.](https://assets.amazon.science/b7/42/03be071743d5a57cb1656e6caa34/scaling-generative-pre-training-for-user-ad-activity-sequences.pdf)
2/ [SLIDR: Real-time Robot Detection On Online Ads, IAAI 2023, Deployed Highly Innovative Applications of AI Track (AAAI 2023)](https://assets.amazon.science/75/2f/3b7106b143f38f7f4d2806388ace/real-time-detection-of-robotic-traffic-in-online-advertising.pdf)
3/ [Self-supervised Representation Learning Across Sequential and Tabular Features Using Transformers, NeurIPS 2022, First Table Representation Learning Workshop](https://openreview.net/forum?id=wIIJlmr1Dsk)

- 2+ years of data scientist experience
- 3+ years of data querying languages (e.g. SQL), scripting languages (e.g. Python) or statistical/mathematical software (e.g. R, SAS, Matlab, etc.) experience
- 3+ years of machine learning/statistical modeling data analysis tools and techniques, and parameters that affect their performance experience
- 1+ years of guiding and coaching a group of researchers experience
- 1+ years of working with or evaluating AI systems experience
- 1+ years of creating or contributing to mathematical textbooks, research papers, or educational content experience
- Master's degree in, or experience working in, Science, Technology, Engineering, or Mathematics (STEM)
- Experience applying theoretical models in an applied environment
- Knowledge of machine learning concepts and their application to reasoning and problem-solving
- Experience in Python, Perl, or another scripting language
- Experience in a ML or data scientist role with a large technology company
- Experience in defining and creating benchmarks for assessing GenAI model performance
- Experience working on multi-team, cross-disciplinary projects
- Experience applying quantitative analysis to solve business problems and making data-driven business decisions
- Experience effectively communicating complex concepts through written and verbal communication

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

Required skills
Python, SQL, Machine Learning, Statistical Analysis, Data Analysis