SC Judicial Department's Transparency Woes

Accidental obsession

If you've been following me for a bit, you'll know one of the things I've really enjoyed is scraping data out of different government systems. I became obsessed with South Carolina's Supreme Court after they issued a decision striking down Governor McMaster's creation of a private school voucher program utilizing $23 million of SC's $48 million cut of the CARES Act-era Governor's Emergency Education Relief (GEER) Fund. They ruled the program violated the SC constitution. That decision is worth a read! What's truly wild to me is that during the 2020-2021 school year, the private school headcount was 33,492[1] and the public school headcount was 761,290.[2] That means nearly half of SC's GEER money would have gone to about 4.2% of the students in those two counts. Thank God the court stopped that program in its tracks. But I'm getting distracted here; this is a blog about data in South Carolina, not privatization of schools! Steve Nuzum's Substack is great if you're into that.
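For anyone who wants to check that math, here is a quick sketch using only the figures quoted above. The variable names are mine. Note that dividing the private headcount by the public headcount gives about 4.4%, while the private share of the combined headcount is closer to 4.2%.

```python
# Headcounts and dollar figures quoted in the post.
private_headcount = 33_492
public_headcount = 761_290
voucher_dollars = 23_000_000
geer_dollars = 48_000_000

total_headcount = private_headcount + public_headcount

# Share of the state's GEER money that the voucher program would have used.
funding_share = voucher_dollars / geer_dollars

# Two ways to express how small the private school population is.
private_share_of_all = private_headcount / total_headcount
private_vs_public = private_headcount / public_headcount

print(f"Voucher share of GEER money: {funding_share:.1%}")
print(f"Private share of combined headcount: {private_share_of_all:.1%}")
print(f"Private headcount relative to public: {private_vs_public:.1%}")
```

These counts only cover the two headcount reports cited here, so they leave out homeschooled students and anyone else not in either report.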

Voucher Proponents Across the Country are Winning. What Are the Stakes?
Across the country, states like Utah and Iowa are getting closer to passing school voucher legislation, while many others have already done so. According to Education Week, “lawmakers in at least 11 states—Idaho, Iowa, Kansas, Missouri, Nebraska, North Dakota, Oklahoma, South Carolina, Texas, Utah,…
Post from Other Duties as assigned about school vouchers

Anyway, that SC Supreme Court decision got more media coverage than most of the others do, and I started paying more attention to the court. They issued a series of opinions regarding mask mandates, and that led me to think about creating a Twitter bot that would post opinions as quickly as they were posted on the Court's website.

I was bored

A few days after I got the code written it started posting opinions, and I was ecstatic.

Some twitter bots are actually good!

The bot ended up posting some fairly notable opinions, like the decision ruling portions of the Heritage Act unconstitutional and the decision that held the "Heartbeat" bill unconstitutional. I also added tweets for orders issued by the court. Overall, this project was a resounding success, and it only took me about an hour and a half to write the code that drives it.
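This isn't the bot's actual code, but the core idea is small enough to sketch: keep track of what has already been posted, and only tweet what's new. Everything below is hypothetical, including the `Opinion` class, the function names, and the example.org URLs. Fetching the listing and calling the Twitter API are left out entirely, and the length check counts raw characters, which is a simplification of how Twitter counts links.

```python
from dataclasses import dataclass

TWEET_LIMIT = 280  # classic tweet length limit, counted here as raw characters


@dataclass(frozen=True)
class Opinion:
    title: str
    url: str


def find_new(listing, seen_urls):
    """Return opinions whose URL has not been posted yet, in listing order."""
    return [op for op in listing if op.url not in seen_urls]


def format_tweet(op, limit=TWEET_LIMIT):
    """Build 'title url', trimming the title so the whole tweet fits."""
    room = limit - len(op.url) - 1  # one space between title and URL
    title = op.title if len(op.title) <= room else op.title[: room - 3] + "..."
    return f"{title} {op.url}"


def run_once(listing, seen_urls, post):
    """Post every unseen opinion and remember it so it is never posted twice."""
    for op in find_new(listing, seen_urls):
        post(format_tweet(op))
        seen_urls.add(op.url)


# Made-up data: the first opinion was already posted, the second is new.
seen = {"https://example.org/opinions/1"}
listing = [
    Opinion("Example v. Example", "https://example.org/opinions/1"),
    Opinion("Sample v. State", "https://example.org/opinions/2"),
]
run_once(listing, seen, print)  # `print` stands in for a real posting function
```

Run something like `run_once` on a schedule and persist `seen_urls` between runs, and you have the skeleton of a bot like this one.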

The problem

I suppose that's one of the reasons why Avery Wilks, formerly of the Post and Courier, reached out to me last year and asked if there was a way to scrape the Trial Court Public Index system. I told him it was possible! Little did I know that the Public Index is run by the Judicial Department, and they really don't like web scraping or transparent systems.

After a couple of minutes of chatting with Avery, I ran into a problem:

a screen capture of a portion of the SC judicial department's public index system that says data scraping is prohibited
Data scrapers are prohibited!

Prohibiting data scraping and automated use of a system is annoying, but it's especially annoying here because this system is separated by county. If you want to look up case records for a specific defendant in multiple counties, you're going to have to look them up county by county. Even worse, if you've got a defendant who was a party in multiple cases, like Alex Murdaugh, you might even have to figure out the individual case numbers to make the system usable, because it limits the search output to 196 results instead of breaking it up into pages like most search tools do.

On top of all that, the system utilizes a software tool called the Imperva Web Application Firewall (WAF), which uses all kinds of techniques to prevent users from scraping a site. Some of the methods the WAF uses are so harsh that it will lock legitimate users out of the site for a period of time, simply because they made too many requests.
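To show how a lockout like that can catch an ordinary user, here is a toy rate limiter. It is not how Imperva works internally (I don't know that), just a generic sliding-window sketch. The class name and the thresholds (5 requests per 10 seconds, 60-second lockout) are made up for illustration.

```python
from collections import deque


class SlidingWindowLimiter:
    """Lock out a client that exceeds max_requests within `window` seconds."""

    def __init__(self, max_requests, window, lockout):
        self.max_requests = max_requests
        self.window = window
        self.lockout = lockout
        self.recent = deque()    # timestamps of recently allowed requests
        self.locked_until = 0.0  # time at which the current lockout ends

    def allow(self, now):
        # Still locked out: reject without looking at anything else.
        if now < self.locked_until:
            return False
        # Drop timestamps that have aged out of the window.
        while self.recent and now - self.recent[0] >= self.window:
            self.recent.popleft()
        # Too many recent requests: start a lockout and reject this one.
        if len(self.recent) >= self.max_requests:
            self.locked_until = now + self.lockout
            self.recent.clear()
            return False
        self.recent.append(now)
        return True


# A person clicking through search results once per second for 8 seconds.
limiter = SlidingWindowLimiter(max_requests=5, window=10, lockout=60)
results = [limiter.allow(t) for t in range(8)]
print(results)
```

With these made-up settings, the sixth click trips the limit, and every request after that is rejected until the lockout expires, even though nothing automated was going on.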

NAACP v. Kohn et al.

After discovering all of this, I was extremely irritated that documents of public interest were so inaccessible, so I did some Googling and found an ongoing federal court case in which the SC chapter of the National Association for the Advancement of Colored People (NAACP) is suing the SC Judicial Department over this exact issue.

The NAACP represents many folks who are facing eviction from their homes, and they claim they are unable to use the Public Index to find eviction notices within the ten-day period in which tenants in SC are allowed to request a hearing on a notice. The NAACP asserts they have a First Amendment right to obtain these notices and that the judicial department's protections violate that right.

I tend to agree. This system exists to serve the public, so why can't we scrape it if we want to? Basically, it boils down to this: the court has a very outdated system and is unwilling to modernize it, so they've resorted to somewhat exotic protection methods such as AI-driven tools, as indicated in court filings from the judicial department.[3] It blows my mind that it's easier or more efficient to build protection methods on top of a system than to modernize it, but that's really the path the Judicial Administration chose.

It's worth noting that this paragraph:

Access to the South Carolina Judicial Department Public Index web sites by a site data scraper or any similar software intended to discover and extract data from a website through automated, repetitive querying for the purpose of collecting such data is expressly prohibited.

appears to have been added back when Jean Toal was the Chief Justice.[4]

Most of the information released publicly about the protections in place comes from the affidavit of Joel Hilke, the department's security architect. He gets pretty technical in there, but it's worth reading just to understand what the department perceives as a threat to the Public Index.

The key takeaway is that the department has viewed scrapers as a threat since the Public Index system went live, but so far they haven't had to produce evidence showing exactly how large that threat is. I submitted a public records request asking the department to produce web server logs for the system dating back to 2016, and they unsurprisingly denied my request.

an email requesting web server logs of the public index system
I wish this had worked but I was not hopeful
a response from the SC Court Administration denying my request
Wow, I'm so surprised...

The filings by the judicial department in this case state that there are alternative methods for obtaining bulk data from the case records system. There are, but requests must be mailed in and processed by the department. Additionally, this method can cost requesters thousands of dollars.[5]

So what's next?

This federal case is scheduled to go to trial after October 31, 2023, much to the chagrin of the Judicial Department. The judge in the case denied the defendant's motion to dismiss, and that was a really big win for the NAACP. I'm very eager to see how this proceeds. There have been cases about the legality of scraping before, but never a case about whether prevention of scraping violates the First Amendment. And it's happening in South Carolina of all places. Bravo NAACP, I hope y'all win this.

  1. SC Dept. of Education Private School Headcount (2020-2021) ↩︎

  2. SC Dept. of Education Public School Headcount (2020-2021) ↩︎

  3. Affidavit of Joel Hilke ↩︎

  4. Jean Toal automation quote ↩︎

  5. Request for Bulk or Compiled Data ↩︎

Thanks for reading! Here are some of the things I enjoyed reading this week.

The new statehouse liturgies
We need more words and acts of defiance like in Nebraska, Tennessee, and Kentucky
I have some questions for Elon Musk
If Elon is going to platform Nazis and war criminals, he needs to own those decisions.
Everything Online Malign Influence Newsletter
April 17, 2023 from Hoaxlines Lab