[multistep] Merge master and fix python binding. (#365)

* update vw version (#307)
* fb to dsjson log converter (#303)
* Add fb to dsjson joined log converter as an option of parser
* [binary parser] simple e2e test for metrics (#309)
- only for linux
* Fix bug with CB action with _slot_id breaking hard rl.net.cli in sim mode. (#310)
* [Binary parser] more error handling and logging (#311)
* [Binary parser] refactor reward related structs and methods (#314)
* Ataymano/i joiner (#315)
* refactor reward related structs and methods
* fix header imports
* i_joiner interface
* initialize parser from i_joiner*
* virtual -> override
* unique_ptr<i_joiner>&& as init arg
* fixes
Co-authored-by: Olga Vrousgou <[email protected]>
* [Binary parser] fix reward setting bugs (#316)
* update vw to 15c197 (#318)
- update header that got removed on vw
- update signature of metrics from tuple to pair
- this will enable --dsjson --extra_metrics
changes from VW:
15c197975 fix: bfgs preconditioner, Resolves #2745 (#2995)
98876a2ff feat: [metrics] add optional metrics for --dsjson (#2992)
462b4c82f feat: [metrics] add support for float (#2991)
b250d8565 refactor: Integrate count and sum feat sq into the interaction generation routine (#2987)
a7891c9a6 fix: Added save_load to squarcb (#2988)
733548daa style: Use float instead of float ref (#2986)
cd89d7b26 build: [WIP] Enable building C# projects on Windows with CMake (#2929)
6cd559100 style: Give template types more descriptive names (#2984)
7f49e95a0 fix: Add read_line_json_s function taking line length (#2968)
549ebcdeb refactor: Support const iterators to feature groups (#2972)
5d86cafa4 refactor: Do not use cstyle casts (guideline 16) (#2981)
1c31f950d fix: Unify duplicated error code headers (#2978)
4f0fe622f fix: use nullptr instead of NULL (#2980)
4eeecd343 fix: Avoid float to double promotion by using C++ version of math functions (#2979)
e3769cf09 Fixing json parse issue when pdrop is 0 (#2977)
89a7fe1b2 fix: fixes for headers not including things they need (#2974)
b1f853988 refactor: Remove unused foreach_feature func (#2973)
* add cb-loop to example_gen and make it accept a config file (#322)
* Gets CMake working on Windows (#320)
* Also enables building dotnet bindings via CMake on Windows
* [Binary parser] add test that compares json and fb joined logs's outputs (#323)
* mac ci: continue on error true (#327)
ignoring until #300 is fixed or we figure out why commit 13950d0 started consistently failing on mac ci.
* update vw to 8a24c1b4e (#324)
8a24c1b4e fix: [metrics] account for label action not predicted (#3004)
7b490aecc fix: cb_explore_adf_first.cc save_load (#2997)
b5939a84b Update dev environment to use new image (#3001)
97e137989 refactor: [metrics] cb_explore_adf optional metrics (#2998)
060c4c01f fix: keep gamm_exponent in squarecb (#2999)
7a495295f fix: [metrics] add missing ccb tests (#2996)
* [binary parser] set build default to release (#325)
* add file format to README (#329)
* Enable import of native msbuild projects into CMake-generated solution (#332)
* Enable native VS .NET projects in CMake generated sln
* Add utility target to launch VS with CMake build environment
* Fix target file location for rltest-onnx
* more info in README for binary parser (#334)
* Python binding overhaul, automatic binary and documentation generation (#330)
* WIP python bindings
* rl static deps
* update args
* update ci
* update build image
* remove other platforms for now
* add docs section
* Write migration guide
* rename from main to py_api.cc
* remove swig pieces
* migrate examples
* add build to ignore
* More updates
* Remove unnecessary bits of conf
* remove duplicated add_subdirectory
* Default static deps to off
* test ci
* fetch submodules
* fix doc gen
* Update build_python_wheels.yml
* Update build_python_wheels.yml
* Fix examples
* Support additional cmake args
* Add stats log replay tool. (#331)
* Add stats log replay tool.
* Simplify data loading and add documentation.
* feat: Generate and upload docs for every commit (#336)
* supply SPHINXBUILD to python doc builder ci (#339)
* ci: Another fix for doc building (#340)
* Ataymano/multistep joiner (#341)
* update vw version (#338)
dsjson parsing will now skip malformed lines instead of throwing exception. Run with --strict_parse to throw instead of skip.
commits:
04d218b fix: [metrics] set to zero (#3035)
860c525 feat: [metrics] cb_explore_adf additional metrics (#3032)
110a3e4 feat: [py] DFtoVW -> Add weight attribute to SimpleLabel (#3033)
b7625e8 feat: Create class to manage dual indexing of feature_groups (#3029)
6335dd1 refactor: Avoid v_array copies (#3030)
3df4973 feat: add flag to limit log output (#3021)
888e6f7 refactor: cb_explore_pdf (#3026)
99c645b feat: [py] Add contextual bandit label to DFtoVW (#2713)
3089e1b feat: add generic_range adapter class (#3025)
001cbb4 refactor: cb_explore_adf_synthcover (#3024)
f1fab51 refactor: cb_explore_adf_greedy (#3023)
21e9f8b fix: set default learn_returns_prediction to false, refactor cover (#3017)
9abf890 feat: [py] add api for learner metrics (#3022)
eb56eb7 refactor: Non printable namespaces should not be wildcard expanded (#3020)
fbe1e6b feat: [dsjson] Malformed lines are now skipped instead of being fatal (#3007)
4781c28 refactor: CCB participates in interaction reduction expansion (#3013)
613c93c refactor: cb_explore_adf_bag.cc make_reduction_learner (#3016)
a361623 refactor: refactor cb_explore_adf_rnd.cc with make_reduction_learner (#3014)
6ece769 refactor: cb_explore_adf_softmax.cc with make_reduction_learner (#3015)
c7ea383 refactor: eval_count_of_generated_ft PR follow up (#3011)
88b0eaf fix: refactor cb_explore_adf_first (#3012)
62bc992 fix: Refactor regcb with make_reduction_learner (#3010)
d0727e7 fix: Refactor squarecb with make_reduction_learner (#3009)
72c1db6 fix: warning for const qualifier on reference type (#3008)
26ada35 feat: support cached interaction expansion as a reduction (#2993)
17953f7 fix: regcb save_load and keep mellowness (#3006)
bf2162e fix: remove non printable ascii char from tests (#3005)
* Add skip learn logic to binary file parser (#333)
* Add skip learn logic to binary file parser
* fix: make launch_vs command set envvars correctly (#335)
The previous version of this was setting the BaseOutputFolder (which is changed under CMake when building DotNet bindings to ensure that native dlls are next to the managed dlls that require them) with quotes inside the variable. The fix is to quote the full set arg, so that quotes are not included in the value.
* [Binary parser] refactoring to make room for more loop types (#343)
* ci: Restrict ci to run only on master or release branches or all PRs (#344)
* [Binary parser] More refactoring and some initial ccb parsing (#345)
* vw pdrop hotfix pickup (#349)
* Add appid to event metadata for V2 schema (generic_event only) (#337)
* Skip learn metrics (#346)
* Skip learn metrics
* [Binary parser] ccb parsing (#347)
* Updating VW from dbf5ae7ca to aaabadd9b (#350)
Co-authored-by: irvinec <[email protected]>
* call c_str() safely (#354)
* Update bin log spec (#357)
* fix small bugs
* Handle file header and magic as messages and allow them at any point in the stream.
* Ignore events if no checkpoint info is available.
* Fix test suite.
* Remove binary_parser::read_magic as it's no longer used by parse_examples.
* Add tests for support of message reordering.
This adds new tool `log_gen` that takes a string and output a log with the set of messages as defined in it. This was used to cook the test files used in the test suite.
* Move config readiness to i_joiner.
* fix style and add documentation.
* converter to skip learning (#358)
Co-authored-by: Rodrigo Kumpera <[email protected]>
* Support file format update. (#359)
* [binary-log] Add CLI override to checkpoint parameters. (#360)
CLI overrides must be sticky to be useful, so we introduce the notion of sticky values that once set, can't be modified. This is done by having a sticky_value<T> container in which the setter takes an argument which controls the stickyness of the value. This way we can use it for both CLI sources and file sources.
* Fix default case warning.
* Fix python build doc.
* Fix two issues that slipped into master by accident. (#362)
* Updating VW commit from aaabadd to ecadd8d (#363)
From commit aaabadd9b9af5d9970b08bfc7999a3382b6e6dc5
To commit ecadd8d8b622a74e4deed519e04ffe92ce4a9892
ecadd8d8b fix: Multiline Example handling when in multi-pass mode (#3143)
473f06263 test: get type information of VW options (#3142)
b5782bcae fix: Fix CB label caching (#3141)
24a2434ac refactor: return ScoredDual with bound (#3121)
a1b885252 test: update runtests.AUTOGEN.json (#3140)
fb1a5a817 refactor: remove all usages of delete_v and deprecate (#3138)
237e0f799 refactor: Don't use a shared ptr for audit (#3134)
9821ab91f refactor: deprecate some v_array functions (#3136)
7958d62dc refactor: use destructor to manage memory (#3131)
f2bd8e5e7 refactor: remove all usages of v_init and deprecate (#3135)
9c2ef84f2 refactor: simplify search memory management (#3132)
e1bef6108 ci: lock to specific version of google test in benchmark CI (#3137)
b8ed478a9 refactor: do not need to call delete_v anymore (#3130)
b2b7c9aeb fix: [Python] Fix default parameter convert_labels in sklearn_vw.VWRegressor (#3129)
86c374643 refactor: deprecate ezexample (#3128)
723743a05 fix: cats getting stuck to predicting tree edges for some bandwidths (#3114)
e0733c422 fix: use nullptr ostream when quiet is on (#3119)
863eee1a5 fix: fix two compiler warnings (#3123)
ec0accec9 fix: Check label type for shared features, Resolves #3088 (#3120)
b7d41f9b3 fix: [cb_to_cb_adf] multiple costs in label (#3126)
b06f76788 test: add to cmakelists unit test for cb_dro (#3117)
49fb54207 ci: Build docs as part of CI (#3090)
19c2031bc build: MSVC check must be done before project call (#3113)
16562d846 chore: Update deprecations to be explicit about 9.0 release (#3103)
e539f3c7d fix: setup_example should not be called twice on the same example (#3115)
eddadc095 fix: [DFtoVW] Fix issue with nan/None in feature (#3109)
ad39f4262 ci: Add Python type checking (#3105)
26442043b test: [py] fix seed for sim test (#3101)
6e052afbf feat: add learner builder for no-data learners (#3100)
57affdc57 chore: add deprecation notice for #3084 feature_self_interactions (#3099)
9241c148a refactor: remove cast in slim tests (#3098)
28fc4baec test: add tests for namespaces with same first letter (#3096)
becb082a2 feat: [Tutorial] Using DFtoVW and exploring VW output (#3068)
7f62d9fb4 chore: fix deprecation issue template (#3092)
3e7d62d61 test: [py] fix random failure by lowering value (#3089)
4ee7d96a9 chore: add requirements file to be used by binder (#3087)
3babe2213 feat: add chained_proxy_iterator to make it easier to iterate entire namespace index groups (#3076)
c7ff60303 refactor: Fix unnecessary usages of char* (#3085)
5c8fd55bc chore: Add deprecation template (#3028)
258e89f2c feat: [pyvw] Get weight from name (#3042)
4aa7728cc feat: add cb benchmark with more features (#3080)
ed13b81c9 fix: fix gdb pretty printer (#3079)
Co-authored-by: irvinec <[email protected]>
* Fix episodic python binding.

Co-authored-by: Eduardo Salinas <[email protected]>
Co-authored-by: cheng-tan <[email protected]>
Co-authored-by: Rodrigo Kumpera <[email protected]>
Co-authored-by: olgavrou <[email protected]>
Co-authored-by: Alexey Taymanov <[email protected]>
Co-authored-by: olgavrou <[email protected]>
Co-authored-by: Jacob Alber <[email protected]>
Co-authored-by: Jack Gerrits <[email protected]>
Co-authored-by: Sheetal Lahabar <[email protected]>
Co-authored-by: Casey Irvine <[email protected]>
Co-authored-by: irvinec <[email protected]>
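
Note on the sticky values described for #360 in the message above: the container added there is a C++ `sticky_value<T>`, whose code is not part of this diff. The Python sketch below only illustrates the semantics as the message words them (a setter that takes a stickiness argument, and a value that cannot be modified once set as sticky). The class name `StickyValue`, its methods, and the boolean return value are hypothetical and may not match the real container.

```python
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class StickyValue(Generic[T]):
    """Hypothetical Python analogue of the C++ sticky_value<T> container."""

    def __init__(self, initial: Optional[T] = None) -> None:
        self._value: Optional[T] = initial
        self._sticky = False

    def set(self, value: T, sticky: bool = False) -> bool:
        """Store `value` unless a sticky value is already in place.

        Returns True if the value was stored, False if it was ignored.
        (The return value is an assumption made for this sketch.)
        """
        if self._sticky:
            return False
        self._value = value
        self._sticky = sticky
        return True

    def get(self) -> Optional[T]:
        return self._value


# A CLI source sets the parameter as sticky; a later file source cannot override it.
param = StickyValue(0.0)
param.set(1.0, sticky=True)   # CLI override (sticky)
param.set(2.0)                # value read from a file: ignored
print(param.get())
```

Using one container for both sources keeps the precedence rule in a single place: the caller only decides whether its write is sticky, and the container decides whether the write takes effect.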
VowpalWabbit · Jul 13, 2021 · fd281d1
1 parent 8a9e8e5
commit fd281d1
Showing 7 changed files with 87 additions and 46 deletions.
diff --git a/.gitignore b/.gitignore
@@ -31,4 +31,6 @@ ext_deps
 /bindings/cs/rl.net.cli/InternalsVisibleToTest.cs
 /test_tools/log_parser/reinforcement_learning/
 _build
-.idea
+.idea
+dist/
+rl_client.egg-info/
diff --git a/README.md b/README.md
@@ -175,6 +175,5 @@ cmake .. -G "Visual Studio 15 2017 Win64" -DCMAKE_TOOLCHAIN_FILE=[vcpkg root]\sc
 
 ## Make targets
 - `doc` - Python and C++ docs
-- `_rl_client` - Python bindings
 - `rlclientlib` - rlclient library
 - `rltest` - unit tests
diff --git a/bindings/python/README.md b/bindings/python/README.md
@@ -7,11 +7,11 @@ Commands are relative to repo root.
 python setup.py install
 
 # Or, if vcpkg used for deps
-python setup.py --cmake-options="-DCMAKE_TOOLCHAIN_FILE=\"/path_to_vcpkg_root/scripts/buildsystems/vcpkg.cmake\"" install
+python setup.py --cmake-options="-DCMAKE_TOOLCHAIN_FILE=/path_to_vcpkg_root/scripts/buildsystems/vcpkg.cmake" install
 ```
 
 - For Ubuntu 20.04, Python 3.8 a recommended vcpkg version is: `Release 2020.06, commit 6185aa7`
 
 ## Usage
 
-After successful installation, an example is in [`examples/python/basic_usage.py`](../../examples/python/basic_usage.py).
+After successful installation, an example is in [`examples/python/basic_usage.py`](../../examples/python/basic_usage.py).
diff --git a/bindings/python/docs/migration_guide.rst b/bindings/python/docs/migration_guide.rst
@@ -40,7 +40,7 @@ Changes to:
 
     # ...
 
-    client = rl_client.live_model(_, on_error)
+    client = rl_client.LiveModel(_, on_error)
 
 
 3. Init
@@ -57,12 +57,12 @@ Changes to:
 
 .. code-block:: python
 
-    client = rl_client.live_model(config)
+    client = rl_client.LiveModel(config)
 
 4. `choose_rank` return value
 -----------------------------
 
-`choose_rank` no longer returns a tuple, but now returns a :meth:`rl_client.ranking_response` object that contains the same information as was contained in the tuple.
+`choose_rank` no longer returns a tuple, but now returns a :meth:`rl_client.RankingResponse` object that contains the same information as was contained in the tuple.
 
 .. code-block:: python
 

diff --git a/bindings/python/py_api.cc b/bindings/python/py_api.cc
@@ -4,6 +4,7 @@
 #include "config_utility.h"
 #include "constants.h"
 #include "live_model.h"
+#include "multistep.h"
 
 #include <exception>
 #include <memory>
@@ -134,8 +135,10 @@ PYBIND11_MODULE(rl_client, m) {
         Container class for all configuration values. Generally is constructed from client.json file read from disk
     )pbdoc")
       .def(py::init<>())
-      .def("get", &rl::utility::configuration::get, py::arg("name"),
-           py::arg("defval"), R"pbdoc(
+      .def("get", &rl::utility::configuration::get,
+           py::arg("name"),
+           py::arg("defval"),
+           R"pbdoc(
         Get a config value or default.
 
         :param name: Name of configuration value to get
@@ -161,7 +164,8 @@ PYBIND11_MODULE(rl_client, m) {
              THROW_IF_FAIL(live_model->init(&status));
              return live_model;
            }),
-           py::arg("config"), py::arg("callback"))
+           py::arg("config"),
+           py::arg("callback"))
       .def(
           "choose_rank",
           [](rl::live_model &lm, const char *context, const char *event_id,
@@ -174,7 +178,9 @@ PYBIND11_MODULE(rl_client, m) {
                 lm.choose_rank(event_id, context, flags, response, &status));
             return response;
           },
-          py::arg("context"), py::arg("event_id"), py::arg("deferred") = false,
+          py::arg("context"),
+          py::arg("event_id"),
+          py::arg("deferred") = false,
           R"pbdoc(
         Request prediction for given context and use the given event_id
 
@@ -184,17 +190,31 @@ PYBIND11_MODULE(rl_client, m) {
           "choose_rank",
           [](rl::live_model &lm, const char *context, bool deferred) {
             rl::ranking_response response;
+            rl::api_status status;
             unsigned int flags = deferred ? rl::action_flags::DEFERRED
                                           : rl::action_flags::DEFAULT;
-            rl::api_status status;
             THROW_IF_FAIL(lm.choose_rank(context, flags, response, &status));
             return response;
           },
-          py::arg("context"), py::arg("deferred") = false, R"pbdoc(
+          py::arg("context"),
+          py::arg("deferred") = false,
+          R"pbdoc(
         Request prediction for given context and let an event id be generated
 
         :rtype: :class:`rl_client.RankingResponse`
     )pbdoc")
+      .def(
+          "request_episodic_decision",
+          [](rl::live_model &lm, const char* event_id, const char* previous_id, const char* context, rl::episode_state& episode) {
+            rl::ranking_response response;
+            rl::api_status status;
+            THROW_IF_FAIL(lm.request_episodic_decision(event_id, previous_id, context, response, episode, &status));
+            return response;
+          },
+          py::arg("event_id"),
+          py::arg("previous_id"),
+          py::arg("context"),
+          py::arg("episode"))
       .def(
           "report_action_taken",
           [](rl::live_model &lm, const char *event_id) {
@@ -208,14 +228,25 @@ PYBIND11_MODULE(rl_client, m) {
             rl::api_status status;
             THROW_IF_FAIL(lm.report_outcome(event_id, outcome, &status));
           },
-          py::arg("event_id"), py::arg("outcome"))
+          py::arg("event_id"),
+          py::arg("outcome"))
       .def(
           "report_outcome",
           [](rl::live_model &lm, const char *event_id, float outcome) {
             rl::api_status status;
             THROW_IF_FAIL(lm.report_outcome(event_id, outcome, &status));
           },
-          py::arg("event_id"), py::arg("outcome"))
+          py::arg("event_id"),
+          py::arg("outcome"))
+      .def(
+          "report_outcome",
+          [](rl::live_model &lm, const char *episode_id, const char *event_id, float outcome) {
+            rl::api_status status;
+            THROW_IF_FAIL(lm.report_outcome(episode_id, event_id, outcome, &status));
+          },
+          py::arg("episode_id"),
+          py::arg("event_id"),
+          py::arg("outcome"))
       .def("refresh_model", [](rl::live_model &lm) {
         rl::api_status status;
         THROW_IF_FAIL(lm.refresh_model(&status));
@@ -267,6 +298,15 @@ PYBIND11_MODULE(rl_client, m) {
             :rtype: list[(int,float)]
     )pbdoc");
 
+  // TODO: Expose episode history API.
+  py::class_<rl::episode_state>(m, "EpisodeState")
+      .def(py::init<const char *>())
+      .def_property_readonly(
+          "episode_id",
+          [](const rl::episode_state &episode) {
+            return episode.get_episode_id();
+          });
+
   m.def(
       "create_config_from_json",
       [](const std::string &config_json) {

diff --git a/examples/python/basic_usage.py b/examples/python/basic_usage.py
@@ -47,46 +47,45 @@ def basic_usage_cb():
 def basic_usage_multistep():
     config = load_config_from_json("client.json")
 
-    model = rl_client.live_model(config)
-    model.init()
+    model = rl_client.LiveModel(config)
 
-    episode1 = rl_client.episode_state("episode1")
-    episode2 = rl_client.episode_state("episode2")
+    episode1 = rl_client.EpisodeState("episode1")
+    episode2 = rl_client.EpisodeState("episode2")
 
     # episode1, event1
     context1 = '{"shared":{"F1": 1.0}, "_multi": [{"AF1": 2.0}, {"AF1": 3.0}]}'
     response1 = model.request_episodic_decision(
         "event1", None, context1, episode1)
-    print("episode id:", episode1.get_episode_id())
+    print("episode id:", episode1.episode_id)
     print("event id:", response1.event_id)
     print("chosen action:", response1.chosen_action_id)
 
     # episode2, event1
     context1 = '{"shared":{"F2": 1.0}, "_multi": [{"AF2": 2.0}, {"AF2": 3.0}]}'
     response1 = model.request_episodic_decision(
         "event1", None, context1, episode2)
-    print("episode id:", episode2.get_episode_id())
+    print("episode id:", episode2.episode_id)
     print("event id:", response1.event_id)
     print("chosen action:", response1.chosen_action_id)
 
     # episode1, event2
     context2 = '{"shared":{"F1": 4.0}, "_multi": [{"AF1": 2.0}, {"AF1": 3.0}]}'
     response2 = model.request_episodic_decision(
         "event2", "event1", context2, episode1)
-    print("episode id:", episode1.get_episode_id())
+    print("episode id:", episode1.episode_id)
     print("event id:", response2.event_id)
     print("chosen action:", response2.chosen_action_id)
 
     # episode2, event2
     context2 = '{"shared":{"F2": 4.0}, "_multi": [{"AF2": 2.0}, {"AF2": 3.0}]}'
     response2 = model.request_episodic_decision(
         "event2", "event1", context2, episode2)
-    print("episode id:", episode2.get_episode_id())
+    print("episode id:", episode2.episode_id)
     print("event id:", response2.event_id)
     print("chosen action:", response2.chosen_action_id)
 
-    model.report_outcome(episode1.get_episode_id(), "event1", 1.0)
-    model.report_outcome(episode2.get_episode_id(), "event2", 1.0)
+    model.report_outcome(episode1.episode_id, "event1", 1.0)
+    model.report_outcome(episode2.episode_id, "event2", 1.0)
 
 
 if __name__ == "__main__":

diff --git a/rlclientlib/logger/logger_facade.cc b/rlclientlib/logger/logger_facade.cc
@@ -80,31 +80,32 @@ namespace reinforcement_learning {
 
     int interaction_logger_facade::log(const char* context, unsigned int flags, const ranking_response& response, api_status* status, learning_mode learning_mode) {
       switch (_version) {
-      case 1: return _v1_cb->log(response.get_event_id(), context, flags, response, status, learning_mode);
-      case 2: {
-        v2::LearningModeType lmt;
-        RETURN_IF_FAIL(get_learning_mode(learning_mode, lmt, status));
-        generic_event::object_list_t actions;
-        generic_event::payload_buffer_t payload;
-        event_content_type content_type;
-
-        RETURN_IF_FAIL(wrap_log_call(_ext, _serializer_cb, context, actions, payload, content_type, status, flags, lmt, response));
-        return _v2->log(response.get_event_id(), std::move(payload), _serializer_cb.type, content_type, std::move(actions), status);
-      }
+        case 1: return _v1_cb->log(response.get_event_id(), context, flags, response, status, learning_mode);
+        case 2: {
+          v2::LearningModeType lmt;
+          RETURN_IF_FAIL(get_learning_mode(learning_mode, lmt, status));
+          generic_event::object_list_t actions;
+          generic_event::payload_buffer_t payload;
+          event_content_type content_type;
+
+          RETURN_IF_FAIL(wrap_log_call(_ext, _serializer_cb, context, actions, payload, content_type, status, flags, lmt, response));
+          return _v2->log(response.get_event_id(), std::move(payload), _serializer_cb.type, content_type, std::move(actions), status);
+        }
+        default: return protocol_not_supported(status);
       }
     }
-    
+
     int interaction_logger_facade::log(const char* episode_id, const char* previous_id, const char* context, const ranking_response& response, api_status* status) {
       switch (_version) {
-      case 2: {
-        generic_event::object_list_t actions;
-        generic_event::payload_buffer_t payload;
-        event_content_type content_type;
+        case 2: {
+          generic_event::object_list_t actions;
+          generic_event::payload_buffer_t payload;
+          event_content_type content_type;
 
-        RETURN_IF_FAIL(wrap_log_call(_ext, _multistep_serializer, context, actions, payload, content_type, status, previous_id, response));
-        return _v2->log(episode_id, std::move(payload), _multistep_serializer.type, content_type, std::move(actions), status);
-      }
-      default: return protocol_not_supported(status);
+          RETURN_IF_FAIL(wrap_log_call(_ext, _multistep_serializer, context, actions, payload, content_type, status, previous_id, response));
+          return _v2->log(episode_id, std::move(payload), _multistep_serializer.type, content_type, std::move(actions), status);
+        }
+        default: return protocol_not_supported(status);
       }
     }
 
@@ -184,7 +185,7 @@ namespace reinforcement_learning {
     , _v2(_version == 2 ? new generic_event_logger(
       time_provider,
       create_legacy_async_batcher<generic_event>(c, sender, watchdog, perror_cb, OBSERVATION_SECTION, _serializer_shared_state),
-      c.get(name::APP_ID, "")) : nullptr) {		
+      c.get(name::APP_ID, "")) : nullptr) {
     }
 
     int observation_logger_facade::init(api_status* status) {