• blakestacey@awful.systems · 10 months ago

    Shot, in the post:

    Gina and I eventually decided that the data collection process was too time-consuming, and we stopped partway through.

    Chaser, from the comments:

    Josh You and I wrote a python script that searches Google for a list of keywords, saves the text of the web pages in the search results, and shows them to GPT and asks it questions about them from a prompt. This would quickly automate the rest of your data collection

    • carlitoscohones@awful.systems · 10 months ago (edited)

      the data collection process was too time-consuming

      Just to show how time-consuming this process might have been, it consisted of two people doing Google searches and assigning the names to a handful of categories.

      1 - I copied the list of signatories from their website.
      2 - Gina Stuessy and I searched the internet for “(name) lawsuit”, “(name) crime” and also looked at their Wikipedia page.
      3 - I categorized any results into “financial”, “sexual”, and “other”, and also marked if they had spent at least one day in jail.
      4 - Gina and I eventually decided that the data collection process was too time-consuming, and we stopped partway through. The final dataset includes 115 of the 232 signatories.[2][3]