Recruitment

Dystopian factor: moderate

Recruitment is a process that involves extensive writing from candidates and a significant amount of reading from recruiters. Candidates strive to improve the quality of their answers and to reduce the time it takes to complete an application. Recruiters aim to read and remember every application, but this becomes impossible when faced with a high volume of applications.

LLMs read and write well, so it's fair to assume they will be increasingly used by both candidates and recruiters. Any interventions we think of where AI could be helpful for recruiters will be relevant across all industries. There is also potential for a partnership with the recruitment platform Applied (with whom we have a relationship, and who are also thinking about these things).


research

Candidate ranking

A lot of effort goes into objectively scoring each answer from a job application, e.g. on a scale of 1 to 5. These scores are averaged across three human reviewers, and the top candidates are selected for interview.

The scores aren't used for anything else, and the process is time-consuming. All we really need is the order of candidates, starting with the most promising candidates for the role. With a ranked list, it would be relatively quick for a human to go through the bottom half to filter out candidates who are clearly not suitable.
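To make the current process concrete, here's a minimal sketch of how scores might be averaged and turned into an order. The table and column names are hypothetical, not our actual data format:

```python
import pandas as pd

# Hypothetical scores table: one row per (candidate, reviewer) score.
scores = pd.DataFrame({
    "candidate": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "reviewer":  [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "score":     [4, 5, 4, 3, 3, 2, 2, 4, 3],
})

# Average each candidate's scores across the three reviewers, then sort.
# The averages themselves are discarded -- only the order matters.
ranking = scores.groupby("candidate")["score"].mean().sort_values(ascending=False)
print(ranking.index.tolist())  # ['A', 'C', 'B']
```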

I'm interested in a couple of questions:

  1. Ignoring AI, would pairwise comparison be a better approach to ranking the candidates, rather than scoring each answer individually?
  2. This sounds scary, but... could an LLM reliably do pairwise comparison for us?

Pairwise comparison is where a reviewer is presented with two answers to the same question from two different candidates, and asked to choose which one is better. By repeating this process on a large number of pairs, we can build up a ranking of all candidates.
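For illustration, here's a minimal sketch of how pairwise judgements could be aggregated into a ranking using a simple Elo update. The `judge` callable is a stand-in for whoever makes the comparison (a human reviewer, or potentially a wrapped LLM call); note that comparing all pairs is O(n²), so a real version would probably sample pairs or use a sorting-based scheme:

```python
from itertools import combinations

def elo_rank(candidates, judge, k=32, initial=1000.0):
    """Rank candidates via pairwise judgements aggregated with Elo updates.

    `judge(a, b)` returns whichever of the pair it considers better; in
    practice this could be a human reviewer or an LLM prompted with both
    answers.
    """
    ratings = {c: initial for c in candidates}
    for a, b in combinations(candidates, 2):
        winner = judge(a, b)
        loser = b if winner == a else a
        # Expected score of the winner under the current ratings.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400))
        ratings[winner] += k * (1.0 - expected)
        ratings[loser] -= k * (1.0 - expected)
    return sorted(candidates, key=ratings.get, reverse=True)

# Toy example: a "judge" that simply prefers the longer answer.
answers = ["short", "a medium answer", "a much longer, detailed answer"]
print(elo_rank(answers, judge=lambda a, b: max(a, b, key=len)))
```

A Bradley-Terry model would be a more principled way to aggregate noisy comparisons, but Elo is easy to run incrementally as judgements come in.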

I have application data from two similar roles: one posted pre-ChatGPT, one after. I intend to see how ranking by Claude 2 compares to the human ranking, and whether that can be beaten by a custom GPT model trained on the human scores.
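A natural way to compare two rankings of the same candidates is a rank correlation coefficient such as Kendall's tau, which counts concordant versus discordant pairs. A sketch with made-up ranks (scipy's kendalltau does the work):

```python
from scipy.stats import kendalltau

# Made-up example: ranks assigned to the same six candidates
# by the human reviewers and by the LLM.
human_rank = [1, 2, 3, 4, 5, 6]
llm_rank   = [2, 1, 3, 5, 4, 6]

tau, p_value = kendalltau(human_rank, llm_rank)
print(f"Kendall's tau = {tau:.2f} (p = {p_value:.3f})")  # tau = 1.0 means identical order
```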


research

ChatGPT detector

Some job ads are receiving twice as many applications as last year. It's assumed that this is due to candidates completing applications en masse by copy-pasting responses from ChatGPT. There are two phenomena observed by recruiters:

  1. Candidate responses look competent at first glance but lack substance, making it hard to filter out poorly suited candidates.
  2. Candidate responses are actually high quality, but the candidates turn out to be incompetent at interview.

Recruiters may have to change the way they pose questions in applications to make it harder for ChatGPT to give a convincing answer off the bat. But is there anything we can do to help recruiters detect ChatGPT responses? It can't be done definitively, but we may be able to apply a warning label to the laziest copy-paste responses.

With the same application data used for candidate ranking above, I intend to answer the pre-ChatGPT application questions myself with ChatGPT, multiple times. I'll then calculate embeddings for each answer and use these to quantitatively compare how similar the ChatGPT answers are to each other versus the human answers. It's ambitious, but if successful this approach could be used to assign a similarity score to applicant answers and flag those that surpass some threshold value.
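A minimal sketch of that comparison, assuming the embeddings have already been computed (with any sentence-embedding model) and using a placeholder threshold that would need calibrating against the human answers:

```python
import numpy as np

def cosine(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_pairwise_similarity(embeddings):
    """Average cosine similarity over all pairs -- a rough measure of how
    'samey' a set of answers is."""
    sims = [cosine(a, b)
            for i, a in enumerate(embeddings)
            for b in embeddings[i + 1:]]
    return sum(sims) / len(sims)

def flag_answer(answer_emb, chatgpt_embs, threshold=0.9):
    """Flag an answer whose closest match among known ChatGPT answers exceeds
    `threshold`. The 0.9 value is a placeholder, to be calibrated against the
    human answers."""
    score = max(cosine(answer_emb, c) for c in chatgpt_embs)
    return score, score > threshold
```

Comparing mean_pairwise_similarity on the ChatGPT answers versus the human answers would show whether the ChatGPT answers really do cluster more tightly, before committing to any particular threshold.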