# Hosting bloom-7b1 on Amazon SageMaker using HuggingFace Text Generation Inference (TGI)

![TGI Architecture](https://huggingface.co/spaces/text-generation-inference/README/resolve/main/architecture.jpg)

Text Generation Inference (TGI) is a Hugging Face toolkit for serving large language models, implemented as a Rust, Python, and gRPC server.

This [notebook](./hf-tgi-bloom7b1.ipynb) shows how to deploy [bigscience/bloom-7b1](https://huggingface.co/bigscience/bloom-7b1), an open-access multilingual language model, to an Amazon SageMaker real-time endpoint with a TGI backend.

A list of architectures optimized for hosting with TGI can be found [here](https://github.com/huggingface/text-generation-inference#optimized-architectures).
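
As a rough illustration of the deployment described above, the following is a minimal sketch using the SageMaker Python SDK. It is not taken from the notebook: the helper names (`build_tgi_env`, `build_payload`, `deploy`), the instance type `ml.g5.2xlarge`, the token limits, and the startup timeout are all assumptions chosen for illustration, and the notebook may use different values.

```python
def build_tgi_env(model_id, num_gpus, max_input_length, max_total_tokens):
    """Build the environment variables passed to the TGI container.

    SageMaker passes container environment values as strings.
    """
    if max_input_length >= max_total_tokens:
        # TGI expects the input budget to be smaller than the total token budget.
        raise ValueError("max_input_length must be smaller than max_total_tokens")
    return {
        "HF_MODEL_ID": model_id,             # model pulled from the Hugging Face Hub
        "SM_NUM_GPUS": str(num_gpus),        # number of GPUs to shard the model across
        "MAX_INPUT_LENGTH": str(max_input_length),
        "MAX_TOTAL_TOKENS": str(max_total_tokens),
    }


def build_payload(prompt, max_new_tokens=64):
    """Build a request body in the format accepted by the TGI generate API."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}


def deploy(role_arn, instance_type="ml.g5.2xlarge"):
    """Create a real-time endpoint (hypothetical helper; sizing is an assumption)."""
    # Imported lazily so the helpers above work without the SageMaker SDK installed.
    from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

    # Resolve the TGI container image for the current region.
    image_uri = get_huggingface_llm_image_uri("huggingface")
    model = HuggingFaceModel(
        role=role_arn,
        image_uri=image_uri,
        env=build_tgi_env("bigscience/bloom-7b1", 1, 1024, 2048),
    )
    # Large models can take several minutes to download and load,
    # so allow a generous health-check window.
    return model.deploy(
        initial_instance_count=1,
        instance_type=instance_type,
        container_startup_health_check_timeout=600,
    )
```

With valid AWS credentials and an IAM role, `predictor = deploy(role_arn)` followed by `predictor.predict(build_payload("Hello, my name is"))` would send a generation request to the endpoint. Remember to call `predictor.delete_endpoint()` afterwards to avoid ongoing charges.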

## References

1. <https://github.com/huggingface/text-generation-inference>
2. <https://huggingface.co/bigscience/bloom-7b1>
3. <https://github.com/huggingface/text-generation-inference#optimized-architectures>