UPDATE:
Had to
/UPDATE
I've been doing this for a long, long time -- Googling, more or less; that is, searching the internet for documents, mostly about surveillance and public safety, but really covering a broad range of subjects. Sometimes when you're experimenting with a search you don't have a subject in mind at all, or rather, the subject matter isn't reflected in any keywords.
Having done this for so long, I have easily collected terabytes of data, and probably have 1 TB of this material on my machines right now -- perhaps 50 gigabytes of it is of use or interest, after removing repeats and garbage files.
Some of it is pretty sensitive, some I know is classified, and there remain many whole directory trees that may contain Top Secret information for all I know; I have not gotten around to parsing the corpus.
I've already posted an invitation to researchers of all stripes to dig in to the documents -- I know some of this is important, and I can't just sit on it.
I quickly received a response from a researcher I'm familiar with, Stacie, who worked on Project PM and whom I'd be pleased to have dissecting this trove.
The thing is, Stacie very generously offered to review anything I wanted to send, and I want to send all of it, but the size of just this first section makes doing that extremely tedious (I will shortly see whether an archive of the whole cache unzips without corruption). What I need is requests, so I posted a picture of the directory for people to find items of interest, but I am also realizing that what the folders contain may not be very obvious from their titles.
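For anyone who ends up on the receiving end of an archive like this, here is a minimal sketch of one way to check that it reads back cleanly before unpacking it. It assumes the archive is an ordinary .zip file, and the function name check_archive is made up for illustration. It uses Python's standard zipfile module, whose testzip() reads every member and compares checksums.

```python
import zipfile


def check_archive(path):
    """Return (ok, detail); ok is True when every member passes its CRC check.

    Hypothetical helper for illustration; assumes a standard .zip archive.
    """
    if not zipfile.is_zipfile(path):
        return False, "not a zip archive"
    with zipfile.ZipFile(path) as zf:
        # testzip() returns the name of the first corrupt member, or None.
        bad = zf.testzip()
        if bad is not None:
            return False, "first corrupt member: " + bad
        return True, str(len(zf.namelist())) + " members read cleanly"
```

Calling check_archive on the downloaded file before extracting anything tells you whether a re-send is needed. It only checks that the archive is internally consistent, not that its contents match what was originally packed.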
I need to tell you a little more about these particular documents... but in a way, that defeats the whole purpose of this exercise.
However, I can try this:
First, I have written a little about some sore thumbs (the items that I found that led me to these full FTP directories of company files).
One company concerned, Excalibur/Covenant Security Solutions, has a checkered history, having been busted helping TSA agents cheat on their exams at SFO in 2002-3.
I've also indexed these documents with a wicked program called dtSearch, so I can query key terms and export the search results, with context from the documents, for you to review. I did one search already, C4ISR. You can comment, email, or tweet your suggestions to me.
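To give a rough idea of what "search results with context" means, here is a minimal keyword-in-context sketch in Python. This is not how dtSearch works internally and it builds no index; it only scans plain text that has already been extracted. The function names, the CSV column layout, and the 40-character context width are all assumptions made for illustration.

```python
import csv
import re
import sys


def keyword_in_context(text, term, width=40):
    """Return (left, match, right) for each case-insensitive whole-word hit."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


def export_hits(named_texts, term, out=sys.stdout, width=40):
    """Write one CSV row per hit and return the number of hits.

    named_texts is an iterable of (source_name, text) pairs (hypothetical
    input shape; the text must already be extracted from each document).
    """
    writer = csv.writer(out)
    writer.writerow(["source", "left", "match", "right"])
    count = 0
    for name, text in named_texts:
        for left, match, right in keyword_in_context(text, term, width):
            writer.writerow([name, left, match, right])
            count += 1
    return count
```

A call like export_hits(pairs, "C4ISR") would print one row per hit with the file name and the text on either side of the term, which is the kind of export a reviewer can skim without opening every document.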
I also realized I needed to make the image bigger in the last post. Done and done.
Here it is again.