PDAP Newsletter

PDAP Newsletter | Big grant news!

Hi there,

Thanks for subscribing to email updates from the Police Data Accessibility Project!

I write with good news: we received a grant of $250,000 from The Heinz Endowments! This year’s arc is fundamentally changed—we’re closer than ever to making a positive impact on our systems and communities using better police data. You can find details in our blog post here.

The ask: this grant will take us to new heights, but doesn’t cover our whole budget. Contribute here to help us cover things like legal fees, grant writing help, and infrastructure. Help us turn volunteers into paid employees! We’re tiny, and just getting started, so a little goes a long way. If you can’t donate, consider sharing this with a friend.

#15
May 20, 2022

PDAP Newsletter | First Grant Awarded

🎉🎉🎉

On May 12, we received some excellent news: we've been awarded a grant of $250,000 by The Heinz Endowments to make progress on our app. Until now, most work on PDAP (except for Data Bounties) has been done pro bono by a small cast of folks. We've been lucky to receive help from talented, generous volunteers, and we're proud of our steady progress.

The short version: we get to hire staff to work on PDAP.

Grant details

#13
May 17, 2022

PDAP Newsletter | New Year Recap

Happy New Year!

In brief, the story is this: we have a small group of people making excellent progress toward our mission—but this is a big project. If you're reading this, you can help us.

What happened in 2021?

  • We added dozens of new Scrapers to the repo.

  • We achieved non-profit status, opening the door for funding and grant opportunities.

  • We ran our first DoltHub bounty, making real progress on our database of police Datasets.

  • We launched a simple app to help contributors write new Scrapers.

  • We started making Instagram posts. Go follow us!

  • We had dozens of open working sessions. Some of them were quiet, with one or two people making progress. Some of them were brainstorms, where people disagreed and learned from each other. These happen in Discord, and you should join one! Look for the calendar icon and "Events" in the upper left.

#12
January 2, 2022

PDAP Newsletter | 501c3 Approval

We did it!

In January, we filed for 501c3 status with the IRS. On August 14, 2021, it was accepted! Police Data Accessibility Project Inc is now a registered nonprofit. Here's our GuideStar profile.

The biggest change is that your donations may be tax deductible!

What's next?

#11
September 18, 2021

PDAP Newsletter | Bounty Retro

Background

Our first Data Bounty with DoltHub was intended to give data scrapers a cash reward for their submissions to public data. They funded and supported the endeavor—thank you Tim, Katie, and the rest of the DoltHub team for giving our project this huge leap forward!

Gains

Our goal was to complete our Agencies table, which is a critical piece of infrastructure. It allows us to draw a line directly from a department's website to the processed, accessible police data that was collected from it.

#10
July 14, 2021

PDAP Newsletter | Slack → Discord

For a variety of reasons, we have decided to archive our old Slack channel and merge with our existing Discord! There was no need to have two messaging channels for the same project, so this consolidates them into one. While some Slack features, such as threads, will be missed, the much-more-used Discord will be a fantastic new home. We will be spending the next few days making adjustments to make the community more welcoming and help guide new users through this new experience!

#9
June 7, 2021

PDAP Newsletter | Bounty Update 4

  • Our first dolt bounty PR for agencies has been accepted with over 68,000 edits!!

  • Katie (Dolt) has a bash script that currently clones the repo and checks for whitespace, duplicates, and some general domain keywords to ensure someone is not passing a fake URL just to get an edit count.

    • She will turn it into a Python script with unit tests, and also flesh it out to check datasets as well. This script will be useful for us too, as we can use it as part of our own pipeline for PR verification in the future!
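To give a feel for the kind of checks described above, here is a minimal Python sketch. It is not Katie's script: the function name, the keyword list, and the way duplicates are normalized are all our own assumptions for illustration.

```python
from urllib.parse import urlparse

# Hypothetical keyword list; the real script's keywords are not given here.
DOMAIN_KEYWORDS = ("police", "sheriff", "pd", "gov")


def check_urls(urls):
    """Return (url, problem) pairs for submitted URLs that look suspicious."""
    problems = []
    seen = set()
    for url in urls:
        # Any whitespace (leading, trailing, or embedded) is flagged.
        if any(ch.isspace() for ch in url):
            problems.append((url, "whitespace"))
            continue
        # Assumed normalization: ignore case and a trailing slash.
        normalized = url.lower().rstrip("/")
        if normalized in seen:
            problems.append((url, "duplicate"))
            continue
        seen.add(normalized)
        # Flag hosts that contain none of the expected keywords.
        host = urlparse(url).netloc.lower()
        if not any(keyword in host for keyword in DOMAIN_KEYWORDS):
            problems.append((url, "no expected keyword in domain"))
    return problems
```

A keyword check like this can only flag rows for a human to look at; plenty of legitimate agency sites will not match a short keyword list.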

#7
June 2, 2021

PDAP Newsletter | Bounty Update 3

  • We are still waiting to start merging PRs for the bounty. Katie (Dolt) has a script she will use to verify the data integrity of bounty PRs (see edit at bottom). If all looks good, PRs will start being merged tomorrow!

  • At the time of writing this post, we have 8 open PRs for the bounty that reflect:

    • over 50,000 row updates on the agencies table (populating lat/lng, city, zip, fips & homepage_url)!

    • 111 new datasets!

    • 4 new data types!

  • The Dolt team is aware of a bug that prevents NULL from being inserted during CSV import and is looking into it now (see attached)

  • As mentioned in the above thread, I removed the UNIQUE constraint on URLs

Not bounty related:

  • We have a few bounty participants who are very interested in actually loading the data and are hoping for another bounty for data intake

  • One participant, Alexis, has a DoltHub repo here with data she has scraped from the FBI for missing persons / wanted persons along with source code here! An excellent addition that we will look into!

#6
June 1, 2021

PDAP Newsletter | Dolt → PostgreSQL

Richard was hard at work and created a foreign data wrapper (FDW) to access a cloned dolt instance and copy the data into our own PostgreSQL instance that our applications run on.

The current implementation is that Dolt fires information over a webhook when a branch is pushed / merged. We can use this information to see exactly when a PR is merged into master and trigger a dolt pull and restart of our dolt sql-server instance. Then, a stored procedure will activate to load the new data into our PostgreSQL instance so we always have a stable, up-to-date, local copy.
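The decision step in that flow could look roughly like the Python sketch below. This is an illustration, not our actual handler: the `ref` payload field, the branch ref format, the service name, and the repo path are all assumptions that would need checking against a real webhook delivery and our own setup.

```python
import json
import subprocess

# Assumed ref format for the branch we care about.
TARGET_REF = "refs/heads/master"


def sync_commands(payload_body):
    """Return the commands to run for one webhook delivery, or [] to ignore it."""
    event = json.loads(payload_body)
    # "ref" is an assumed field name for the branch that was pushed or merged.
    if event.get("ref") != TARGET_REF:
        return []
    return [
        ["dolt", "pull"],
        # Hypothetical systemd unit wrapping `dolt sql-server`.
        ["systemctl", "restart", "dolt-sql-server"],
    ]


def run_sync(payload_body, repo_dir):
    """Run the sync commands inside the cloned repo; repo_dir is supplied by the caller."""
    for command in sync_commands(payload_body):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Keeping the "should we sync?" decision in a function that only returns commands makes it easy to test without touching Dolt. The stored procedure that copies data into PostgreSQL would run after these commands and is not shown.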

#6
May 28, 2021

PDAP Newsletter | Bounty Update 2

Our bounty is off to an exciting start! We already have one PR where someone has essentially filled out the entire agencies table (14,711 rows modified!) with city, zip, fips, lat, lng, and homepage_url. We are continuing to work with the Dolt team and will soon be accepting the data into our master branch!

#5
May 28, 2021

PDAP Newsletter | Dolt Bounty Start

As of 13:31 (PST) / 16:31 (EST), our Dolt Bounty is officially live! Anyone who adds data into the datasets table will get $0.33 per row (maximum cap is $5,000 for the entire bounty). The bounty is running until July 7th @ 1500 (PST) / 1800 (EST)! You can find more information here about the bounty.

  • Only Dolt bounty related PRs can be approved (so we cannot make our own commits or PRs to master for the next 6 weeks)

  • Only Katie (Dolt Team) can make the PR approvals on the pdap/datasets repo during the active bounty (she needs to use a special internal framework to attribute credit to the bounties)

  • If we need to make any schema changes or PRs, we will have to create a separate branch and hold all the changes there. It cannot be merged into master until the bounty is over

  • It will be very beneficial to have us monitor the #data-bounties channel in their Discord. If they have any questions about our dataset, it will help us understand how to improve.
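For a rough sense of scale, the $0.33 per row rate and the $5,000 cap mentioned above can be worked through in a few lines of Python. This sketch assumes a flat rate applied until the cap is reached; how the payout is actually split between participants is not covered here.

```python
import math
from decimal import Decimal

RATE_PER_ROW = Decimal("0.33")  # dollars per row, as stated above
BOUNTY_CAP = Decimal("5000")    # maximum payout for the entire bounty


def total_payout(rows_added):
    """Total paid out for a given number of rows, under the flat-rate assumption."""
    return min(RATE_PER_ROW * rows_added, BOUNTY_CAP)


def rows_to_reach_cap():
    """Smallest number of rows at which the cap is reached."""
    return math.ceil(BOUNTY_CAP / RATE_PER_ROW)
```

Under that assumption, `rows_to_reach_cap()` gives 15,152 rows, so the budget would be used up after roughly fifteen thousand new rows.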

#4
May 27, 2021

PDAP Newsletter | ETL Prototype

The GitHub PR is here.

The data that was loaded is from the USA/CA/butte_county/college/chico scraper. I chose this as a starting point because it has two different types of data, with two differing formats. This allowed me to verify the library reads from the schema.json properly and can load and map no matter the data output. You can find that PR here. It also created 2 new datasets here.

We currently have it set to not auto-commit so someone reviews before committing each time. But it does load data from files, use the schema.json file and (mostly) works!
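As an illustration of the schema-driven idea, here is a small Python sketch that maps scraper output onto target columns. The `mapping` key and the column names are hypothetical; the real schema.json layout is not shown in this post and may differ.

```python
import csv
import io
import json

# Hypothetical schema.json content: source column -> target column.
SCHEMA_JSON = """
{
  "mapping": {"Case #": "case_number", "Date Reported": "reported_at"}
}
"""


def map_rows(csv_text, schema_text):
    """Rename the columns listed in the schema and drop everything else."""
    mapping = json.loads(schema_text)["mapping"]
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {target: row[source] for source, target in mapping.items()}
        for row in reader
    ]
```

The function only returns the mapped rows and never writes anything, which fits the review-before-commit approach described above: a person can inspect the result before it is committed.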

Current Process:

#3
May 26, 2021

PDAP Newsletter | DoltHub Bounty

Volunteers gather or "scrape" data for PDAP, but we're running a 6-week data bounty generously sponsored by Dolt. The bounty pays people to gather traceable, approved data. Since we can focus effort by putting a price on specific data, we're able to get a massive head start on our Dataset Catalogue.

From there, we'll be able to scale and consolidate our scraping efforts into an app that can be run to gather data from any dataset. Every time a scraper is run, we add more data to the database!

The current bounty will get us a list of URLs for thousands of police agencies on our Agencies and Datasets tables. Read more about it here.

#2
May 20, 2021

PDAP Newsletter | Alpha App Launched

We published a Django app that shows a map of the US. When you click a state, it shows all the agencies on the map. Right now, it connects to a PostgreSQL instance using static Dolt data. Future plans:

  • getting a sync from dolt for the data

  • fleshing it out to show more information about the current statuses / datasets for each agency

  • having a way to download the data easily

  • sprucing up the intake tools to aid in importing data into Dolt

  • sprucing up the UI (probably aligning with our Gatsby front-end framework)

#1
May 7, 2021