ChatGPT fails IAS Exam


(4 min read)

IAS Exam!

The title of this piece is purposely sensational, to entice you to read further. Of course, as you know, there is no such thing as the IAS exam, but most people nevertheless use that name for the Civil Services Examination (CSE), conducted every year by the UPSC for about 1000 Grade I & II posts.

You may have heard the story, possibly apocryphal, of the parents of a young man from Bihar who visited the home of a potential bride for their ladla (darling son). The girl’s father asked, ‘Aapke suputra ajkal kya kar rahein hain?’ (‘What is your son doing these days?’), and the boy’s father replied, with much pride, ‘Shriman Dilli mein IAS kar rahein hain.’ (‘Sir, he is doing IAS in Delhi.’) That was truthful, since the young man had joined a coaching institute, and had been living in Old Rajinder Nagar in a one-room shanty with a shared bathroom for the last several years.

ChatGPT passes tough Tests

ChatGPT, OpenAI’s chatbot, has been making news ever since its public release in November 2022. It has bedazzled academics by passing several tough Tests, including

·      Wharton Business School’s MBA exam (though with a Grade of B or B-)

·      University of Minnesota’s law exam

·      US Medical Licensing exam, and

·      Google’s interview for a Level 3 engineer for a coding position.

‘ChatGPT on its way to become a virtual doctor, lawyer, or business analyst’ is the title of a piece in a leading business daily.

ChatGPT takes CSE Prelim Exam

It was high time to test the AI tool on the UPSC Civil Services Examination - claimed by the IAS coaching institutes, and by those who wrote it, whether they succeeded or not, to be the ‘toughest test’ in the world.

(Don’t believe that tall claim. At the end of this piece is a little Test you may like to take to see how easy it is. Simple, school-level math!)

I ran a snap sample test for the AI chatbot.

Well, ChatGPT didn’t write the Civil Services (Preliminary) Examination 2022 conducted by the UPSC; but I coerced it into taking a Test. I picked five random questions each from General Studies Papers I and II, and asked ChatGPT to answer them.

Why prolong the suspense? Here is the result:

ChatGPT (C) Scores

General Studies Paper I:

·      Total Questions – 100, Total Marks – 200, Time – 2 hours

·      Each correct answer has a value of 2 marks

·      Each incorrect answer has a penalty of (-) 0.66

·      Of the 5 Questions asked, C got 4 correct, and 1 incorrect

·      Score: 8 – 0.66 = 7.34

·      Maximum marks for the 5 Questions – 10, hence score is 73.4%

General Studies Paper II:

·      Total Questions – 80, Total Marks – 200, Time – 2 hours

·      Each correct answer has a value of 2.5 marks

·      Each incorrect answer has a penalty of (-) 0.8

·      Of the 5 Questions asked, C got 2 correct, and 3 incorrect

·      Score: 5 – 2.4 = 2.6

·      Maximum marks for the 5 Questions – 12.5, hence score is 20.8%

Result:

·      Cut off is based on GSP I score only. C scored very high in GSP I, and may be among the highest scorers

·      However, minimum qualifying marks for GSP II is 33%, and hence C failed.
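
For readers who like to check the arithmetic, here is a minimal Python sketch of the negative-marking calculation above. The function name `sample_score` and the constant `GSP2_QUALIFYING_PCT` are hypothetical labels chosen for illustration; the marks and penalties are the rounded figures used in this piece, not an official UPSC formula.

```python
def sample_score(correct, wrong, marks_per_q, penalty):
    """Net score, maximum score and percentage for a mini-test with negative marking."""
    net = correct * marks_per_q - wrong * penalty
    maximum = (correct + wrong) * marks_per_q
    return net, maximum, 100 * net / maximum


# Figures taken from the two lists above (penalties as rounded in this piece).
gsp1_net, gsp1_max, gsp1_pct = sample_score(correct=4, wrong=1, marks_per_q=2.0, penalty=0.66)
gsp2_net, gsp2_max, gsp2_pct = sample_score(correct=2, wrong=3, marks_per_q=2.5, penalty=0.8)

GSP2_QUALIFYING_PCT = 33  # minimum qualifying marks for GSP II, as stated above

print(f"GSP I : {gsp1_net:.2f} / {gsp1_max:.1f} = {gsp1_pct:.1f}%")
print(f"GSP II: {gsp2_net:.2f} / {gsp2_max:.1f} = {gsp2_pct:.1f}%")
print("GSP II qualified" if gsp2_pct >= GSP2_QUALIFYING_PCT else "GSP II not qualified")
```

With only five questions per Paper, a single answer moves the percentage by a wide margin, so the sketch is best read as a check on the sums rather than a verdict on C.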


(A painting of ChatGPT robot sweating over a tough question. Painting by Prasanna + DALL-E)

CSE Preliminary 2022- Basic Stats

·      Cut-off: GSP I for General category candidates - 43.5% to 50% in previous years; likely cut-off for CSEP 2022 – 45% to 47.5% (90 to 95 marks in GSP I)

·      Number of candidates who applied: 11.52 lakhs

·      Number of candidates who appeared in Prelim exam: About 8 lakhs

·      Cleared Preliminary Exam: 13090 (13090/8 lakh=1.64%)

·      Cleared CSE Main Exam: 2529 (0.32%)

·      Total Vacancies: 861 (0.1%)

·      Chance of a candidate getting selected for Civil Services: 0.1%

Why did ChatGPT fail?*

·      No match for average human intelligence. Someone has observed that AI at present has an ‘intelligence’ far below that of a cockroach, and he may be right.

·      No strategy. A smart candidate would quickly scan the Qs from beginning to end, answer the Qs where she is sure she is correct, and skip the doubtful ones (the golden rule where an incorrect answer earns a penalty). In GSP I, 50 correct answers out of 100 Qs are enough to clear the exam; in GSP II, 27 correct answers out of 80 Qs earn the minimum qualifying marks. No need to try to solve all Qs. Very few candidates have the time to solve all Qs and get them right.

·      No knowledge of previous years’ Qs and cut-offs, or of the criteria for the cut-off.

·      I have analysed ChatGPT’s incorrect answers, and have a good idea why it goofed up. Should I publish my research like the Stanford Professor, or wait for OpenAI to make me an attractive offer? Let me think it over.
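
The ‘no need to solve all Qs’ arithmetic in the strategy point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate attempts nothing she is unsure of (so no penalties apply), the GSP I target is the upper end of the likely cut-off quoted in this piece (95 marks), and the helper name `min_correct` is hypothetical.

```python
import math


def min_correct(target_marks, marks_per_q):
    """Fewest correct answers that reach target_marks, assuming no wrong answers."""
    return math.ceil(target_marks / marks_per_q)


# GSP I: upper end of the likely cut-off range quoted in this piece (95 of 200 marks).
gsp1_needed = min_correct(95, 2.0)

# GSP II: 33% of 200 marks is the qualifying minimum.
gsp2_target = 33 * 200 / 100
gsp2_needed = min_correct(gsp2_target, 2.5)

print(gsp1_needed, gsp2_needed)
```

Under these assumptions the sketch gives 48 for GSP I, so the figure of 50 leaves a small cushion, and 27 for GSP II.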

Disclaimers:

1.     My method of random sampling could be problematic. A better method may be to pick a Question from each category, such as Indian Economy, Constitution and governance, Indian history, Environment, Technology, arithmetic, problem solving, comprehension, etc.

2.     A larger sample, say of 20 Qs from each Paper, may throw up a different result.

3.     C may perform better if asked to take the full test (GSP I -100 Qs, GSP II-80 Qs).

4.     I did not tell C about the penalty for an incorrect answer, or the option to pass a question. However, C sometimes says ‘I don’t know about that’ or ‘I can’t answer that question’. In my test, it answered all the questions readily, and explained the basis of each answer.

5.     C is handling millions of Qs every minute, including the test Qs I threw at it. Of course, it has access to the neural network of supercomputers, and has reportedly been trained on some 300 billion words of text, more than a human can read and remember in several lifetimes.

6.     Once trained on UPSC Question Papers of previous years, C’s performance may improve.

Take a Test

You are, no doubt, among the brightest and the best. To reassure yourself that your grey cells are still robust and agile, would you like to take a Snap Test of a single question which is not too difficult to solve? A correct answer earns you 2.5 marks, and an incorrect answer carries a penalty of (-) 0.8; you have 1.5 mins to solve it, and the option to answer or give it a pass, for which there is no penalty.

Your TIME (1.5 mins) starts now.

Question**

The digits 1 to 9 are arranged in three rows in such a way that each row contains three digits, and the number formed in the second row is twice the number formed in the first row; and the number formed in the third row is thrice the number formed in the first row. Repetition of digits is not allowed. If only three of the four digits 2, 3, 7 and 9 are allowed to use in the first row, how many such combinations are possible to be arranged in the three rows?

(a)  4 (b) 3 (c) 2 (d) 1

I guess you solved it, got it right, and within 1.5 mins. Congratulations, you scored 2.5 marks.

**A Question from Civil Services (Preliminary) Examination, 2022, General Studies Paper II, and ChatGPT gave an incorrect answer for this question!
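
If you would rather let a machine do the sweating, the question yields to brute force. The Python sketch below tries every first row built from three of the digits 2, 3, 7 and 9, doubles and triples it, and keeps the cases where the three rows together use each digit from 1 to 9 exactly once. The function name `solve_rows` is a hypothetical label for illustration. (Spoiler alert: running it gives away the answer.)

```python
from itertools import permutations


def solve_rows(allowed="2379"):
    """Return all (row1, row2, row3) with row2 = 2*row1, row3 = 3*row1,
    row1 built from three distinct digits of `allowed`, and the nine
    digits 1 to 9 each used exactly once across the three rows."""
    solutions = []
    for digits in permutations(allowed, 3):
        first = int("".join(digits))
        second, third = 2 * first, 3 * first
        used = f"{first}{second}{third}"
        # A four-digit second or third row makes `used` too long, so it is rejected here.
        if sorted(used) == list("123456789"):
            solutions.append((first, second, third))
    return sorted(solutions)


for rows in solve_rows():
    print(*rows)
print("Number of arrangements:", len(solve_rows()))
```

Passing `allowed="123456789"` instead lifts the restriction on the first row and lists every arrangement that satisfies the doubling and tripling rule.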

Update*

I published this blog on 22/02/23. The next day, I asked ChatGPT:

Hi, did you fail IAS exam by UPSC, India?

C replied:

I'm sorry, but I cannot take the IAS exam or any other exam as I am an AI language model designed to assist with various tasks such as answering questions, generating text, and providing information on various topics. I have not taken any exams, nor am I capable of doing so. My purpose is to provide helpful responses and assist users in their queries to the best of my ability.

***

Note: 

  • Author's profile is at http://amazon.com/author/pkdash
  • Books by this author are available on Amazon.in, Kindle eBook, Flipkart, and Notion Press, Chennai.

***  

 

 

Comments:

This is an excellent experiment with ChatGPT (C); I liked the honesty with which ChatGPT responded to the author's last question. I tested C on at least one of the research questions I answered in my PhD thesis after a lot of research; C's responses are useful, though superficial, and nowhere close to the level of detail in my thesis. However, over time, I am sure, C will gather enough experience and take away many people's jobs!
