01 May 2026 - tsp
Last update 01 May 2026
10 mins
Recently I stumbled over Tripo3D. For high resolution assets, for an environment that also allows editing and interactive steering, and for automated rigging it is still my tool of choice. But I also decided to take a look into locally hostable solutions. One of the better ones I found is Hunyuan3D by Tencent, whose models are available on HuggingFace. They provide single- and multi-view image-to-3D as well as text-to-3D models and also support texturing of the output. In contrast to the commercial Tripo3D, Hunyuan3D does not support automatic rigging (i.e. the generation of skeletons for animation).
Of course the usage of an online service - no matter how good it is - has its drawbacks:
Even though services like the mentioned Tripo3D are very permissive, they have a rather diffuse ban on potentially undesirable or unwanted content, including NSFW material - and their monitoring triggers from time to time on perfectly acceptable content. For example I had generated a 3D model from the following, morally totally acceptable image

Note that running locally also has drawbacks:
In this article we will look into:
At a high level Hunyuan3D follows a two stage pipeline that separates geometry generation from texture synthesis, similar to many modern 3D diffusion systems.
In the first stage a diffusion model generates the shape of the object. Depending on the configuration this is typically represented internally as a volumetric structure (for example an octree-based representation) or an implicit field, which is then converted into a polygon mesh. When using image input, the model infers missing viewpoints from a single or multiple images, effectively hallucinating the full 3D structure. Multi-view input significantly improves consistency and reduces artifacts, since the model has to rely less on learned priors.
In the second stage a separate model performs texture generation. The previously generated geometry is projected into multiple views and a diffusion model generates consistent surface textures, which are then baked back onto the mesh. This step is responsible for most of the visual quality and realism of the final asset.
The mentioned octree resolution parameter controls how finely the volumetric representation is discretized during geometry generation. Higher resolutions allow finer geometric detail but increase both memory consumption and computation time significantly. Similarly, the number of inference steps controls the quality of both geometry and texture synthesis, trading runtime for fidelity.
One important implication of this pipeline is that errors introduced in the geometry stage (for example wrong topology or missing structures) cannot be fully corrected during texturing. This is why multi-view inputs and careful parameter selection are often crucial for obtaining usable results.
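To make this more concrete, the following sketch shows how the two stages map onto the Python API from the Hunyuan3D-2 repository (in Hunyuan3D-2.1 the corresponding pipelines live in the hy3dshape and hy3dpaint packages, so names may differ); the parameter values are purely illustrative, not recommendations:

# Sketch of the two stage pipeline, assuming the hy3dgen API from the
# Hunyuan3D-2 repository; module names and defaults may differ in 2.1.
from hy3dgen.shapegen import Hunyuan3DDiTFlowMatchingPipeline
from hy3dgen.texgen import Hunyuan3DPaintPipeline

# Stage 1: a diffusion model generates geometry from the input image.
shape_pipeline = Hunyuan3DDiTFlowMatchingPipeline.from_pretrained(
    'tencent/Hunyuan3D-2'
)
mesh = shape_pipeline(
    image='input.png',        # single view input, back sides are hallucinated
    num_inference_steps=50,   # more steps: better quality, longer runtime
    octree_resolution=256,    # finer volumetric discretization, more VRAM
)[0]

# Stage 2: a separate model renders the mesh from multiple views,
# generates consistent textures and bakes them back onto the surface.
paint_pipeline = Hunyuan3DPaintPipeline.from_pretrained('tencent/Hunyuan3D-2')
mesh = paint_pipeline(mesh, image='input.png')

mesh.export('output.glb')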
Note that those models already run on pretty small consumer hardware. The minimal turbo models run with around 8-10 GB of VRAM, single image inference already works with 12-16 GB of VRAM and high end multi-view diffusion requires 24 GB or more. In addition you should have at least as much system RAM, ideally around 2x to 3x that amount; for the high end multi-view models at least 32 GB of RAM would be good. The main limitation on most systems, though, is the VRAM - as is also the case for large language models.
The repository provides very good documentation on how to perform the setup. Keep in mind that, as with any project in the Python ecosystem, it's a good idea to install packages in your own virtual environment due to the frequently used version pinning of dependencies, especially around diffusers and pytorch. I also assume you have already built pytorch for your platform, which is often a major hurdle due to its limited portability to other Unices. If you are lucky and operate on FreeBSD you can use the package py-pytorch (for example for Python 3.11 this would be py311-pytorch, etc.):
pkg install py311-pytorch
pkg install py311-torchvision
If this does not work you can try to install pytorch using the supplied port:
cd /usr/ports/misc/py-pytorch
make install
cd /usr/ports/misc/py-torchvision
make install
If you want to have GPU acceleration you have to make sure that your torch build supports the respective backend.
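A quick way to verify this - a minimal sketch, assuming a working Python environment with torch installed - is to query torch directly:

# Minimal check whether the installed torch build can see a GPU backend
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))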
After this step you can follow the instructions from the repository:
git clone https://github.com/Tencent-Hunyuan/Hunyuan3D-2.1.git
cd Hunyuan3D-2.1
pip install -r requirements.txt
cd hy3dpaint/custom_rasterizer
pip install -e .
cd ../..
cd hy3dpaint/DifferentiableRenderer
bash compile_mesh_painter.sh
cd ../..
Note that two additional modules are built here. Those are plugins for torch that contain native code. When using CUDA you have to make sure that nvcc is available and that CUDA_HOME is set correctly. If those two builds fail the program runs anyway, but without GPU acceleration - which usually implies painfully slow, unusable speeds (think multiple hours to days per asset). If you run CPU only they don't matter.
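Before starting those builds one can quickly check - a small sketch, nothing specific to Hunyuan3D - whether nvcc and CUDA_HOME are visible at all:

# Verify that the CUDA toolchain is discoverable before building the plugins
import os, shutil

print("CUDA_HOME:", os.environ.get("CUDA_HOME", "<not set>"))
print("nvcc:", shutil.which("nvcc") or "<not found in PATH>")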
For a quick test one can now launch the Gradio based frontend:
python3.11 gradio_app.py \
--model_path tencent/Hunyuan3D-2.1 \
--subfolder hunyuan3d-dit-v2-1 \
--texgen_model_path tencent/Hunyuan3D-2.1 \
--low_vram_mode
Building on Windows is a bit more tedious. I personally do not use Windows for many reasons, but I still like to include this section. Luckily for users of this system there exists another repository by Yan Wenkun. He has packaged up the whole project including all required dependencies, its own local Python interpreter and its own pip instance. It is simply extracted from a two-part 7-Zip archive and then executed via its RUN.bat, which provides a launcher to select the model to use and some additional parameters. Note that for compiling the modules of the custom_rasterizer and the DifferentiableRenderer the build system has to be able to locate the nvcc from the CUDA toolkit that matches the expected version. One can set CUDA_HOME inside RUN.bat - for example via
SET CUDA_HOME=c:\program files\NVIDIA GPU Computing Toolkit\CUDA\v12.6\
In addition, when one wants to use Triton, which is traditionally not available for Windows, one has to manually install the triton-windows package using the bundled pip, not the system wide pip.
The following shows a simple example of converting the drawing of an onion (generated via GPT Image 1 by OpenAI) into a 3D model:

This image has been passed into the hunyuan3d-dit-v2-0 shape generation model and then through the Hunyuan3D-2 texture generation system (60 inference steps, octree resolution set to 512, targeting 20000 chunks - so all settings at the lower end). The system I used was based on CUDA 12.6 and hosts a few RTX 3060 cards - the model is only capable of running on a single one of them. After around 600 seconds (10 minutes) the model was ready to be exported - which I did in GLB (glTF) format, which is accepted by many 3D editing tools including Blender.



In addition I tried to slice and print it using the Anycubic slicer:

I also used the same system to generate two 3D models for a traditional PhD hat for a colleague. The graphic shown above has been used to represent a lattice interferometer experiment called LATIN:

In addition I also generated a 3D reconstruction of our scanning electron microscope used for an experiment:

Like most tools in this category, Hunyuan3D also often generates non-manifold meshes that have to be repaired before use in applications for 3D printing (slicers, CAD software, etc.). Very often those meshes are directly usable in 3D engines or utilities like Blender. In addition, these models of course guess information that they cannot infer from the images - this applies especially to back sides. The scale of the models is of course also not realistic - and, as with diffusion networks in two dimensions, artifacts (like additional fingers, additional feet, etc.) often occur.
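For simple cases such a repair can be scripted. The following is a rough sketch using the trimesh library - an assumption on my side, the repository itself does not prescribe a repair workflow - and for severe defects a dedicated tool like Blender's 3D-Print toolbox is still needed:

# Rough mesh repair pass with trimesh before handing the asset to a slicer
import trimesh

mesh = trimesh.load("asset.glb", force="mesh")
print("watertight before:", mesh.is_watertight)

trimesh.repair.fix_normals(mesh)   # consistent face winding and normals
trimesh.repair.fill_holes(mesh)    # close simple holes in the surface
mesh.export("asset_repaired.stl")  # STL for the slicer
print("watertight after:", mesh.is_watertight)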
In my experience Tripo3D produces slightly cleaner topology and faster results, while Hunyuan3D offers full local control at the cost of setup complexity and longer runtimes.
Dipl.-Ing. Thomas Spielauer, Wien (webcomplainsQu98equt9ewh@tspi.at)
This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/