
Posit AI Blog: safetensors 0.1.0


safetensors is a new, simple, fast, and safe file format for storing tensors. The design of the file format and its original implementation are led by Hugging Face, and it is becoming widely adopted in their popular ‘transformers’ framework. The safetensors R package is a pure-R implementation that allows you to both read and write safetensors files.

The initial version (0.1.0) of safetensors is now on CRAN.

Motivation

The main motivation for safetensors in the Python community is security. As noted in the official documentation:

The main rationale for this crate is to remove the need to use pickle on PyTorch which is used by default.

Pickle is considered an unsafe format, since the act of loading a Pickle file can trigger the execution of arbitrary code. This has never been a concern for R users, since the Pickle parser included in LibTorch only supports a subset of the Pickle format, which does not include code execution.

However, the file format has additional advantages over other commonly used formats, including:

  • Support for lazy loading: You can choose to read only a subset of the tensors stored in the file.

  • Zero copy: Reading the file requires no more memory than the file itself. (Technically, the current R implementation does make a single copy, but that can be optimized away if we really need it at some point.)

  • Simple: Implementing the file format is simple and doesn’t require any complex dependencies. This makes it a good format for exchanging tensors between ML frameworks and between different programming languages. For instance, you can write a safetensors file in R and load it in Python, and vice versa.

There are additional advantages compared to other file formats common in this space, and you can see a comparison table in the safetensors documentation.

Format

The safetensors format is described in the following figure. It is basically a header file containing some metadata, followed by raw tensor buffers.
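In more concrete terms, a safetensors file starts with 8 bytes holding the header length as an unsigned little-endian 64-bit integer, followed by a JSON header that records each tensor's dtype, shape and byte offsets, followed by the raw buffer. The base-R sketch below writes such a file by hand and reads the header back. It is only an illustration of the layout, not how the safetensors package is implemented; the tensor name `x` and its values are made up.

```r
# Hypothetical tensor: three float32 values stored under the name "x".
values <- c(1, 2, 3)

# Raw little-endian float32 bytes for the tensor buffer (3 * 4 = 12 bytes).
buffer <- writeBin(values, raw(), size = 4, endian = "little")

# JSON header: dtype, shape and byte offsets into the buffer.
header_json <- sprintf(
  '{"x":{"dtype":"F32","shape":[%d],"data_offsets":[0,%d]}}',
  length(values), length(buffer)
)
header_raw <- charToRaw(header_json)

# First 8 bytes: header length as a little-endian 64-bit integer.
# The header is tiny, so only the low 4 bytes are non-zero.
header_len_raw <- c(
  writeBin(length(header_raw), raw(), size = 4, endian = "little"),
  raw(4)
)

path <- tempfile(fileext = ".safetensors")
writeBin(c(header_len_raw, header_raw, buffer), path)

# Read it back: length prefix, then JSON header, then the raw buffer.
con <- file(path, "rb")
len_bytes <- readBin(con, "raw", n = 8)
# Decoding only the low 4 bytes is enough for headers under 2 GiB.
n <- readBin(len_bytes[1:4], "integer", size = 4, endian = "little")
header_read <- rawToChar(readBin(con, "raw", n = n))
data_read <- readBin(con, "numeric", n = length(values),
                     size = 4, endian = "little")
close(con)

cat(header_read, "\n")
```

Because the header carries the byte offsets of every tensor, a reader can seek directly to the tensors it needs, which is what makes lazy loading possible.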

Basic usage

safetensors can be installed from CRAN using `install.packages("safetensors")`.
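Once installed, a round trip might look like the following minimal sketch. It assumes the torch package is installed alongside safetensors; the tensor names `x` and `y` are made up for illustration. It saves a named list of tensors with `safe_save_file()` and reads them back with `safe_load_file()`.

```r
library(torch)
library(safetensors)

# A named list of torch tensors; the names become the keys in the file.
tensors <- list(
  x = torch_randn(10, 10),
  y = torch_ones(10, 10)
)

# Write everything to a temporary safetensors file.
path <- tempfile(fileext = ".safetensors")
safe_save_file(tensors, path)

# Read the tensors back as torch tensors.
loaded <- safe_load_file(path, framework = "torch")
names(loaded)
```

The same file could then be opened from Python with the safetensors library there, since the on-disk format is shared between implementations.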

Photo by Nick Fewings on Unsplash

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Figures that have been reused from other sources are not covered by this license and can be recognized by a note in their caption: “Figure from …”.

Citation

For attribution, please cite this work as

Falbel (2023, June 15). Posit AI Blog: safetensors 0.1.0. Retrieved from https://blogs.rstudio.com/tensorflow/posts/2023-06-15-safetensors/

BibTeX citation

@misc{safetensors,
  author = {Falbel, Daniel},
  title = {Posit AI Blog: safetensors 0.1.0},
  url = {https://blogs.rstudio.com/tensorflow/posts/2023-06-15-safetensors/},
  year = {2023}
}
