- Dataset: `data/dilemmas_with_detail_by_action.csv`
- Croissant metadata: `metadata/crossiant`
Three use cases:
1. Evaluate model preferences on everyday dilemmas.
2. Evaluate model preferences on dilemmas based on a given principle.
3. Evaluate the model's steerability via system prompt modulation on each given principle.
We used gpt-4-turbo, as demonstrated below. We also provide the responses of six models on our dilemmas dataset, along with the corresponding analysis datasets, so results can be replicated without re-running the models (controlled via the `replication_purpose` flag).
Set up the environment:

```bash
$ conda create --name <env> --file requirements.txt
```
Case 1: Evaluate model preferences on everyday dilemmas

1. Evaluate a model on the dilemmas dataset.

```bash
python eval/evaluate_model_on_dilemma.py \
  --output_jsonl_file_div <output_jsonl_file_div> \
  --model <gpt-4-turbo> \
  --model_name_for_variable <own_created_name_e.g._gpt4> \
  --replication_purpose <full/only_first_five> \
  --api_key <optional_you_can_set_as_.env_or_global>
```
Required Arguments with suggestions:
- `output_jsonl_file_div`: `eval/model_response_from_eval`
- `model`: `gpt-4-turbo`
- `replication_purpose`: `full` (the full dilemmas dataset) or `only_first_five` (only the first five dilemmas)

Optional Arguments:
- `api_key`: you can pass your OpenAI API key as an argument, put it in a `.env` file, or set it as a global variable
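For example, a quick replication run on the first five dilemmas might look like the following (the variable name `gpt4` and the `$OPENAI_API_KEY` environment variable are illustrative):

```bash
python eval/evaluate_model_on_dilemma.py \
  --output_jsonl_file_div eval/model_response_from_eval \
  --model gpt-4-turbo \
  --model_name_for_variable gpt4 \
  --replication_purpose only_first_five \
  --api_key "$OPENAI_API_KEY"
```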
2. Combine the model responses from step 1 with the dilemma details, separated by action.

```bash
python eval/combine_model_eval_with_dilemmas_action_separate.py \
  --input_model_resp_file <step_1_output_jsonl_file> \
  --input_dilemma_file <MoralDilemma_dataset> \
  --output_analysis_file <output_analysis_file_div> \
  --model <model_name_for_variable>
```
Required Arguments with suggestions:
- `input_model_resp_file`: `eval/model_response_from_eval`
- `input_dilemma_file`: `data/dilemmas_with_detail_by_action.csv`, or `data/dilemmas_with_detail_by_action_test.csv` if you used only the first five dilemmas in step 1
- `output_analysis_file`: `eval/model_response_from_eval`. A file named `<model_name>_eval_on_dilemmas_with_action_separate.csv` will be created.
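Continuing the hypothetical five-dilemma run from step 1, an invocation might look like this (paths follow the suggestions above):

```bash
python eval/combine_model_eval_with_dilemmas_action_separate.py \
  --input_model_resp_file eval/model_response_from_eval \
  --input_dilemma_file data/dilemmas_with_detail_by_action_test.csv \
  --output_analysis_file eval/model_response_from_eval \
  --model gpt4
```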
3. Analyze the combined model-response file with dilemma details from step 2, based on the five theories.
```bash
python analysis/analysis_for_model_responses_five_theories/analyze_model_resp_with_five_theories.py \
  --input_model_resp_file <output_from_step_2> \
  --output_analysis_file <e.g._analysis/analysis_for_model_responses_five_theories/model_resp.csv> \
  --models <model_names>
```
Required Arguments with suggestions:
- `input_model_resp_file`: use the output file from step 2, e.g., `eval/model_response_from_eval/<model_name>_eval_on_dilemmas_with_action_separate.csv`. We also provide the six model responses in `eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv`.
- `models`: `gpt4`; for all six models, `gpt4 gpt35 llama2 llama3 mixtral_rerun claude`
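For example, to analyze the provided six-model responses (the output path follows the suggestion in the command template):

```bash
python analysis/analysis_for_model_responses_five_theories/analyze_model_resp_with_five_theories.py \
  --input_model_resp_file eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv \
  --output_analysis_file analysis/analysis_for_model_responses_five_theories/model_resp.csv \
  --models gpt4 gpt35 llama2 llama3 mixtral_rerun claude
```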
4. Create plots from the analysis file produced in step 3.

```bash
python analysis/analysis_for_model_responses_five_theories/create_plot.py \
  --input_analysis_file <output_path_in_step_3/model_resp.csv> \
  --output_graph_div <path_for_graphs>
```
Required Arguments with suggestions:
- `input_analysis_file`: output from step 3, e.g., `analysis/analysis_for_model_responses_five_theories/model_resp.csv`
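A hypothetical invocation, writing the graphs next to the analysis file (the output directory is illustrative):

```bash
python analysis/analysis_for_model_responses_five_theories/create_plot.py \
  --input_analysis_file analysis/analysis_for_model_responses_five_theories/model_resp.csv \
  --output_graph_div analysis/analysis_for_model_responses_five_theories/
```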
Case 2: Evaluate model preferences on dilemmas based on a given principle

1. Get the relevant values for each principle from the collected values by prompting the model 10 times.
```bash
python eval_for_system_prompt_modulation/get_values_for_principle.py \
  --input_principle_file <data/principle_company.csv> \
  --output_jsonl_file_div <e.g._eval_for_system_prompt_modulation/values_and_system_prompt_for_principle> \
  --model <openai_model_e.g._gpt-4-0125-preview> \
  --replication_purpose <full_/or/_only_first_two_principles> \
  --api_key <optional_you_can_set_as_.env_or_global>
```
Required Arguments with suggestions:
- `input_principle_file`: `data/principle_openai.csv`
- `model`: `gpt-4-0125-preview`
- `replication_purpose`: `full` or `only_first_two_principles`

Optional Arguments:
- `api_key`: you can pass your OpenAI API key as an argument, put it in a `.env` file, or set it as a global variable
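For example, a replication run on the first two principles might look like this (`$OPENAI_API_KEY` is illustrative):

```bash
python eval_for_system_prompt_modulation/get_values_for_principle.py \
  --input_principle_file data/principle_openai.csv \
  --output_jsonl_file_div eval_for_system_prompt_modulation/values_and_system_prompt_for_principle \
  --model gpt-4-0125-preview \
  --replication_purpose only_first_two_principles \
  --api_key "$OPENAI_API_KEY"
```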
2. Calculate each value's relevance from the empirical probabilities over the 10 model responses in step 1.
```bash
python eval_for_system_prompt_modulation/calc_value_relevance.py \
  --input_system_prompt_file <output_from_step_1> \
  --output_jsonl_file_div <e.g._eval_for_system_prompt_modulation/values_and_system_prompt_for_principle>
```
Required Arguments with suggestions:
- `input_system_prompt_file`: output from step 1, e.g., `eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/principle_openai.csv`
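For example, using the suggested paths:

```bash
python eval_for_system_prompt_modulation/calc_value_relevance.py \
  --input_system_prompt_file eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/principle_openai.csv \
  --output_jsonl_file_div eval_for_system_prompt_modulation/values_and_system_prompt_for_principle
```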
3. Get the value conflicts and the relevant dilemmas for each principle.

```bash
python eval_for_system_prompt_modulation/get_values_conflicts_and_relevant_dilemmas_for_principle.py \
  --input_principle_file <output_from_step_2_e.g._principle_company_clean.csv> \
  --input_dilemma_file <output_from_case_1_step_2> \
  --output_jsonl_file_div <eval_for_system_prompt_modulation/values_and_system_prompt_for_principle> \
  --model <model_name>
```
Required Arguments with suggestions:
- `input_principle_file`: output from step 2, e.g., `eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/<model_name>_clean.csv`
- `input_dilemma_file`: output from step 2 of case 1. We also provide the six model responses: `eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv`
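A hypothetical invocation (the `gpt4_clean.csv` name is illustrative, following the `<model_name>_clean.csv` pattern above):

```bash
python eval_for_system_prompt_modulation/get_values_conflicts_and_relevant_dilemmas_for_principle.py \
  --input_principle_file eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/gpt4_clean.csv \
  --input_dilemma_file eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv \
  --output_jsonl_file_div eval_for_system_prompt_modulation/values_and_system_prompt_for_principle \
  --model gpt4
```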
Case 3: Evaluate the model's steerability via system prompt modulation on each principle

1. Generate system prompts for each principle: one steering toward supporting values and one steering toward opposing values.
```bash
python eval_for_system_prompt_modulation/generate_system_prompts_for_principle.py \
  --input_system_prompt_file <output_from_step_3_in_case_2> \
  --output_jsonl_file_div <e.g._eval_for_system_prompt_modulation/values_and_system_prompt_for_principle> \
  --model gpt-4-turbo \
  --api_key <optional_you_can_set_as_.env_or_global>
```
Required Arguments with suggestions:
- `input_system_prompt_file`: output from step 3 of case 2, e.g., `<output_path>/principle_gpt4_eval_with_values_conflicts_and_dilemma.csv`

Optional Arguments:
- `api_key`: you can pass your OpenAI API key as an argument, put it in a `.env` file, or set it as a global variable
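A hypothetical invocation, assuming the case 2 outputs were written to the `values_and_system_prompt_for_principle` directory (the input path is illustrative):

```bash
python eval_for_system_prompt_modulation/generate_system_prompts_for_principle.py \
  --input_system_prompt_file eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/principle_gpt4_eval_with_values_conflicts_and_dilemma.csv \
  --output_jsonl_file_div eval_for_system_prompt_modulation/values_and_system_prompt_for_principle \
  --model gpt-4-turbo \
  --api_key "$OPENAI_API_KEY"
```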
2. Evaluate the model on the dilemmas with the generated system prompts.

```bash
python eval_for_system_prompt_modulation/evaluate_model_on_system_prompt.py \
  --input_system_prompt_file <output_from_step_1> \
  --input_dilemma_file <data/dilemmas_with_detail_by_action.csv> \
  --output_jsonl_file_div <e.g._eval_for_system_prompt_modulation/model_response_on_system_prompt/> \
  --model gpt-4-turbo \
  --api_key <optional_you_can_set_as_.env_or_global>
```
Required Arguments with suggestions:
- `input_system_prompt_file`: output from step 1, e.g., `<output_path>/<model>_with_system_prompts.csv`
- `input_dilemma_file`: the MoralDilemmas data: `data/dilemmas_with_detail_by_action.csv`
- `model`: `gpt-4-turbo`

Optional Arguments:
- `api_key`: you can pass your OpenAI API key as an argument, put it in a `.env` file, or set it as a global variable
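A hypothetical invocation (the `gpt-4-turbo_with_system_prompts.csv` name is illustrative, following the `<model>_with_system_prompts.csv` pattern above):

```bash
python eval_for_system_prompt_modulation/evaluate_model_on_system_prompt.py \
  --input_system_prompt_file eval_for_system_prompt_modulation/values_and_system_prompt_for_principle/gpt-4-turbo_with_system_prompts.csv \
  --input_dilemma_file data/dilemmas_with_detail_by_action.csv \
  --output_jsonl_file_div eval_for_system_prompt_modulation/model_response_on_system_prompt/ \
  --model gpt-4-turbo \
  --api_key "$OPENAI_API_KEY"
```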
3. Analyze the model responses under the system prompts, separately for supporting and opposing steering.

```bash
python analysis/analysis_for_system_prompt_modulation/analyze_model_resp_on_system_prompt.py \
  --input_principle_file <output_from_step_2> \
  --input_dilemma_file <output_from_case_1_step_2_or_provided_file> \
  --output_div <e.g._analysis/analysis_for_system_prompt_modulation> \
  --steer <sup/opp> \
  --company_name <used_to_create_the_output_file_name_e.g._openai> \
  --model <the_name_used_for_the_field_and_for_the_output_file_name>
```
Required Arguments with suggestions:
- `input_principle_file`: output from step 2, e.g., `eval_for_system_prompt_modulation/model_response_on_system_prompt/{model}_eval.csv`
- `input_dilemma_file`: output from step 1 in case 1: `eval/model_response_from_eval/<model_name>_eval_on_dilemmas_with_detail_by_action.csv`, or our provided six model responses: `eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv`
- `steer`: `sup` analyzes the model responses under system prompts steering toward supporting values; `opp` analyzes them under system prompts steering toward opposing values.
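For example, to analyze the supporting-steer responses against the provided six-model file (the `gpt4` and `openai` labels are illustrative):

```bash
python analysis/analysis_for_system_prompt_modulation/analyze_model_resp_on_system_prompt.py \
  --input_principle_file eval_for_system_prompt_modulation/model_response_on_system_prompt/gpt4_eval.csv \
  --input_dilemma_file eval/model_response_from_eval/all_models_eval_on_dilemmas_with_detail_by_action.csv \
  --output_div analysis/analysis_for_system_prompt_modulation \
  --steer sup \
  --company_name openai \
  --model gpt4
```

Run it again with `--steer opp` to produce the opposing-value analysis file needed in step 4.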
4. Create plots comparing the supporting and opposing steering results.

```bash
python analysis/analysis_for_system_prompt_modulation/create_plot.py \
  --input_support_value_file <path/...system_prompt_sup_value.csv> \
  --input_oppose_value_file <path/...system_prompt_opp_value.csv> \
  --output_graph_div <analysis/analysis_for_system_prompt_modulation/> \
  --replication_purpose <only_two_principles/or/full>
```
Required Arguments with suggestions:
- `input_[support/oppose]_value_file`: output from step 3 with `steer` set to `sup` or `opp`, respectively: `principle_<companyname>_<modelname>_eval_with_system_prompt_system_prompt_[sup/opp]_value.csv`
- `replication_purpose`: `only_two_principles` or `full`; this should match the setting used in step 1 of case 2
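A hypothetical invocation, assuming `companyname=openai` and `modelname=gpt4` in the file-name pattern above:

```bash
python analysis/analysis_for_system_prompt_modulation/create_plot.py \
  --input_support_value_file analysis/analysis_for_system_prompt_modulation/principle_openai_gpt4_eval_with_system_prompt_system_prompt_sup_value.csv \
  --input_oppose_value_file analysis/analysis_for_system_prompt_modulation/principle_openai_gpt4_eval_with_system_prompt_system_prompt_opp_value.csv \
  --output_graph_div analysis/analysis_for_system_prompt_modulation/ \
  --replication_purpose only_two_principles
```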