lenskit/lkpy

View on GitHub
docs/releases/2024.rst

Summary

Maintainability
Test Coverage
2024 Releases
=============

The 2024 release series is **currently in development**.  No specific release
date is set yet.

2024 will bring breaking changes to several LensKit APIs to improve ergonomics,
correctness-by-default, and flexibility.  It also adopts SPEC0_, a standard for
supported versions of scientific Python libraries, and changes the LensKit
version number scheme to “:ref:`SemCalver`”.

.. _SPEC0: https://scientific-python.org/specs/spec-0000/

2024.1 (in progress)
--------------------

The first 2024 release is currently in-progress.

This document presents the highlights for this release. The full changelog for this release is available in the `Git history <https://github.com/lenskit/lkpy/compare/0.14.4...main>`_
and `issue/PR milestone <https://github.com/lenskit/lkpy/milestone/14>`_.

Significant Changes
~~~~~~~~~~~~~~~~~~~

2024.1 brings substantial changes to LensKit.

*   **PyTorch**. LensKit now uses PyTorch to implement most of its algorithms,
    instead of Numba-accelerated NumPy code.  Algorithms using PyTorch are:

    * :py:class:`~lenskit.algorithms.knn.ItemItem`
    * :py:class:`~lenskit.algorithms.als.ImplicitMF`
    * :py:class:`~lenskit.algorithms.als.BiasedMF`

*   Many LensKit components (batch running, model training, etc.) now report progress with
    :py:mod:`progress_api`, and can be connected to TQDM or Enlighten.

*   Algorithms refactored to more consistent package locations, without the
    ``algorithms`` subpackage.

*   LensKit no longer has top-level exports (and is now in fact a namespace
    package).  Classes and functions must be imported from appropriate
    subpackages.

*   The ``Popular`` recommender has been removed in favor of :class:`~lenskit.algorithms.basic.PopScore`.

New Features (incremental)
~~~~~~~~~~~~~~~~~~~~~~~~~~

* Added RBP top-N metric (:pr:`334`).
* Added command-line tool to fetch datasets (:pr:`347`).

Model Behavior Changes
~~~~~~~~~~~~~~~~~~~~~~

Most models will exhibit some changes, hopefully mostly in performance, due to
moving to PyTorch.  There are some deliberate behavior changes in this new version,
however, documented here.

* ALS models only use Cholesky decomposition (previously selected with the
  erroneously-named ``method="lu"`` option); conjugate gradient and coordinate
  descent are no longer available.  Cholesky decomposition is faster on PyTorch
  than it was with Numba, and is easier to maintain.
* The default minimum similarity for :class:`~lenskit.algorithms.knn.UserUser`
  is now :math:`10^{-6}`.
* k-NN algorithms no longer support negative similarities; ``min_sim`` is clamped
  to be at least the smallest normal in 32-bit floating point (:math:`1.75 \times 10^{-38}`).

Bug Fixes
~~~~~~~~~

* Fixed bug in NDCG list truncation (:issue:`309`, :pr:`312`).
* :py:func:`lenskit.util.clone` now properly clones tuples (:pr:`358`).
* Corrected documentation errors for :py:func:`~lenskit.metrics.topn.recall` and :py:func:`~lenskit.metrics.topn.hit` (:pr:`369` by :user:`lukas-wegmeth`).

Dependencies and Maintenance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Bumped minimum supported dependencies as per SPEC0_ (Python 3.10, NumPy 1.23, Pandas 1.5, SciPy 1.9).
* Added support for Pandas 2 (:pr:`364`) and Python 3.12.
* Improved Apple testing to include vanilla Python and Apple Silicon (:pr:`366`).
* Updated build environment, dependency setup, taskrunning, and CI to more consistent and maintainable.
* Removed legacy random code in favor of :py:mod:`seedbank` (:pr:`351`).
* Code is now auto-formatted with Ruff.