We provide the following fields per job:


  1. crawl_timestamp: this refers to the time the data is crawled by our crawlers.

  2. url: the URL from which this particular job data is crawled.

  3. job_title: the job title as given in the page.

  4. category: the category as given in the page or from the URL pattern.

  5. company_name: the company name as given on the page.

  6. city: the city of the job posting as given in the page. May not always be present.

  7. state: the state of the job posting as given in the page. May not always be present.

  8. country: the country of the job posting as given in the page. Should be present for all jobs; a missing value indicates a serious data problem.

  9. inferred_country: this is a system-generated field. The country field can hold different values for the same country depending on the site (for example, IN, IND, and India all refer to India). This field contains the normalized country name.

  10. inferred_city: This is a system-generated field. It contains the normalized city name, similar to inferred_country. Whether a value is present depends on the value in the city field.

  11. inferred_state: This is a system-generated field. It contains the normalized state name, similar to inferred_country. Whether a value is present depends on the value in the state field.

  12. post_date: The date when this job was posted, as given in the page.

  13. job_description: The description of the job as given in the page.

  14. job_type: The type of the job as given in the page. 

  15. salary_offered: The salary offered for the job as given in the page.

  16. job_board: The normalized value of the job board name from where the job is taken.

  17. cursor: This is a system-generated field. An incrementing value unique to each job. Used for pagination when a client or internal team wants to fetch a large number of jobs incrementally.

  18. contact_email: The contact email as given in the page. Not always present.

  19. contact_phone_number: The contact phone number as given in the page. Not always present.

  20. uniq_id: The system generates a unique id for each job.

  21. html_job_description: The job description in HTML format as given in the job page.

  22. valid_through: This is a system-generated field. The date until which this job is valid, as given in the page. Not always present.

  23. Valid_through: The validity date for the job.

  24. inferred_iso3_lang_code: This is a system-generated field representing the language of the job posting. Internally it depends on the field job_post_lang.

  25. has_expired: This is a system-generated field. The default value is false. It is set to true if the system finds the job to be expired.

  26. latest_expiry_check_date: This is a system-generated field. The date on which the system last checked the expiry status of this particular job posting.

  27. duplicate_status: This is a system-generated field. The default is 'NA'. If the current job is found to be a duplicate of, or very similar to, another job, it is marked as 'yes'. If it is not a duplicate of or very similar to any other job, it is marked as 'no'.

  28. duplicate_of: This is a system-generated field. If the system finds this job to be a duplicate of, or very similar to, another job, that job's id is stored here.
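
As an illustration of how the cursor, has_expired, and duplicate_status fields might be used together, here is a minimal Python sketch. It assumes records arrive as dictionaries keyed by the field names above. The in-memory `JOBS` list, the `fetch_page` helper, and the sample values are all hypothetical stand-ins; the actual retrieval interface is not described here.

```python
from typing import Iterator

# Hypothetical sample records; real records carry all the fields listed above.
JOBS = [
    {"uniq_id": "a1", "cursor": 101, "country": "IN", "inferred_country": "India",
     "has_expired": False, "duplicate_status": "no", "duplicate_of": None},
    {"uniq_id": "b2", "cursor": 102, "country": "IND", "inferred_country": "India",
     "has_expired": False, "duplicate_status": "yes", "duplicate_of": "a1"},
    {"uniq_id": "c3", "cursor": 103, "country": "India", "inferred_country": "India",
     "has_expired": True, "duplicate_status": "NA", "duplicate_of": None},
    {"uniq_id": "d4", "cursor": 104, "country": "US", "inferred_country": "United States",
     "has_expired": False, "duplicate_status": "NA", "duplicate_of": None},
]


def fetch_page(after_cursor: int, limit: int) -> list[dict]:
    """Hypothetical fetch: up to `limit` jobs with cursor > after_cursor, in cursor order."""
    newer = sorted((j for j in JOBS if j["cursor"] > after_cursor),
                   key=lambda j: j["cursor"])
    return newer[:limit]


def iter_jobs(start_cursor: int = 0, page_size: int = 2) -> Iterator[dict]:
    """Walk all jobs incrementally, resuming each page from the last cursor seen."""
    after = start_cursor
    while True:
        page = fetch_page(after, page_size)
        if not page:
            return
        yield from page
        after = page[-1]["cursor"]


def is_usable(job: dict) -> bool:
    """Keep jobs that are not expired and not flagged as duplicates ('NA' is kept)."""
    return not job["has_expired"] and job["duplicate_status"] != "yes"


usable_ids = [j["uniq_id"] for j in iter_jobs() if is_usable(j)]
print(usable_ids)  # ['a1', 'd4']
```

Storing the last cursor value seen lets a later run resume from that point (for example, `iter_jobs(start_cursor=102)`) instead of re-reading every job. Whether jobs with a duplicate_status of 'NA' should be kept or dropped is a choice left to the consumer; this sketch keeps them.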