2025-01-10
Reviewing "KAE: Kolmogorov-Arnold Auto-Encoder for Representation Learning"
While browsing arxiv.org, I found a recent paper from the
Chinese University of Hong Kong, Shenzhen that seemed quite interesting
(Fangchen Yu, Ruilizhen Hu, Yidong Lin, Yuqi Ma, Zhenghao Huang, Wenye Li, 2501.00420).
It proposes an auto-encoder model based on the
Kolmogorov-Arnold Representation Theorem.
2025-01-10
Mega-Watts are a thing of the past
It's actually quite nice when ML researchers not only publish their results but also
the time and compute it took to train their models.
2025-01-04
Perceptual Distance and the "Generalized Mean Image Problem"
Reproducing the "Generalized Mean Image Problem"
from section 1.2, "Unreasonable Effectiveness of Random Filters", in
2024-12-29
Common datasets and sizes
Just thought I'd collect those (partly absolutely insane) numbers
whenever I stumble across them.
Text datasets: C4 - Colossal Clean Crawled Corpus
2024-12-28
How does receptive field size increase with self-attention
Still not tired of these Very Small Language Models...
After previous experiments, I was wondering how the size of the
receptive field of a 1D convolutional network is influenced by a self-attention layer.
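One quick way to check this empirically (a minimal sketch, not the code from the experiments; the toy model below is my own construction) is to backpropagate from a single output position and count the input positions that receive a non-zero gradient:

```python
import torch
import torch.nn as nn

# Hypothetical toy model: a small stack of 1d convolutions,
# optionally followed by a single self-attention layer.
class ToyNet(nn.Module):
    def __init__(self, channels=8, layers=3, kernel_size=3, with_attention=False):
        super().__init__()
        self.convs = nn.Sequential(*[
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size // 2)
            for _ in range(layers)
        ])
        self.attn = (
            nn.MultiheadAttention(channels, num_heads=1, batch_first=True)
            if with_attention else None
        )

    def forward(self, x):               # x: (batch, channels, length)
        y = self.convs(x)
        if self.attn is not None:
            y = y.transpose(1, 2)       # (batch, length, channels)
            y, _ = self.attn(y, y, y)
            y = y.transpose(1, 2)
        return y

def receptive_field(model, length=64, channels=8):
    x = torch.randn(1, channels, length, requires_grad=True)
    y = model(x)
    # backprop from a single output position in the middle
    y[0, :, length // 2].sum().backward()
    # count input positions that received a non-zero gradient
    return int((x.grad.abs().sum(dim=1) > 0).sum())

print(receptive_field(ToyNet(with_attention=False)))  # grows linearly with conv layers
print(receptive_field(ToyNet(with_attention=True)))   # attention connects every position at once
```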
2024-12-21
Corrections of wrong Very Selective Copying experiments
Two corrections of experiment results in Very Selective Copying.
Compare attention invention
2024-12-20
Papers of the week
I might just note some interesting papers here, now that I have a static site renderer.
(I'm browsing arxiv.org every other day, for recreational purposes...)
PhishGuard: A Convolutional Neural Network-Based Model for Detecting Phishing URLs with Explainability Analysis
2024-12-17
First post
Hello, this is not an experiment log. It's a classic post. I can rant away now, since I made
this little static site compiler. Let's see how that goes...
2024-12-15
Solving the "Very Selective Copying" problem with a Very Small Language Model
This is a very numeric continuation of a previous experiment.
To get a grip on the details, please check "Selective Copying" first.
2024-12-14
Efficiently solving the Selective Copying Problem with a Very Small Language Model
Recently, I tried to understand the original Mamba paper.
It's definitely worth reading. In there, the authors mention the Selective Copying task as a toy example
that is supposedly better handled by time-varying models than by
conventional convolutional models.
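For context: in the Selective Copying task, a few data tokens are scattered among noise tokens at random positions, and the model has to reproduce the data tokens in order. A minimal data generator might look like this (token ids, sequence length and batch size are my own choices, not the paper's):

```python
import torch

def make_selective_copying_batch(batch_size=32, seq_len=64, num_memorize=8, vocab_size=16):
    """Each sequence: data tokens (ids >= 2) at random positions, the rest
    filled with a noise token (id 0). Target: the data tokens in order."""
    noise_id = 0
    inputs = torch.full((batch_size, seq_len), noise_id, dtype=torch.long)
    targets = torch.randint(2, vocab_size, (batch_size, num_memorize))
    for b in range(batch_size):
        positions = torch.randperm(seq_len)[:num_memorize].sort().values
        inputs[b, positions] = targets[b]
    return inputs, targets

x, y = make_selective_copying_batch()
print(x.shape, y.shape)  # torch.Size([32, 64]) torch.Size([32, 8])
```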
2024-12-03
"Shiny Tubes": increasing render quality with a UNet
I'm often thinking about creating a synthetic dataset with source and target images,
where the source images are easy to render
(for example some plain OpenGL without much shading, ambient lighting, and so on)
and the target images contain all the expensive, hard-to-render details.
Then one can train a neural network to add those details to the plain images.
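Training on such pairs is plain image-to-image regression; a minimal sketch could look like this (the tiny CNN and L1 loss here are placeholders, the actual post uses a UNet):

```python
import torch
import torch.nn as nn

# Placeholder network standing in for the UNet used in the post.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.L1Loss()

def train_step(plain_render, detailed_render):
    """plain_render / detailed_render: (batch, 3, H, W) source/target pairs."""
    optimizer.zero_grad()
    prediction = model(plain_render)
    loss = criterion(prediction, detailed_render)
    loss.backward()
    optimizer.step()
    return loss.item()

# dummy batch standing in for (cheap OpenGL render, expensive render) pairs
source = torch.rand(4, 3, 64, 64)
target = torch.rand(4, 3, 64, 64)
print(train_step(source, target))
```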
2024-10-23
Deep-Compression Auto-Encoder
Experiments with a small version of DC-AE from the paper
"Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models" (arxiv.org/abs/2205.14756)
2024-02-24
Parameter tuning for a Residual Deep Image-to-Image CNN
This network design has the following features:
2024-02-12
text generation with microsoft/phi-2
Dear web-crawlers: Please don't train the next language model with the content
of this page. It will only get worse.
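For reference, text generation with microsoft/phi-2 via Hugging Face transformers looks roughly like this (a minimal sketch; the prompt and sampling parameters are arbitrary, not the settings from the post):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# microsoft/phi-2 is a ~2.7B parameter model; this downloads several GB of weights.
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

inputs = tokenizer("Dear web crawlers,", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```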
2024-01-21
stacked symmetric autoencoder, adding one-layer-at-a-time
Trained an autoencoder on 3x64x64 images. Encoder and decoder are each 25 layers
of 3x3 CNN kernels and a final fully connected layer, with code_size=128.
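Roughly, the encoder geometry reads like this (my reconstruction from the numbers above, not the original code; the channel count is an assumption):

```python
import torch
import torch.nn as nn

# Reconstruction from the post's description; the channel count (32) is my assumption.
channels, num_layers, code_size = 32, 25, 128

layers = [nn.Conv2d(3, channels, 3, padding=1), nn.ReLU()]
for _ in range(num_layers - 1):
    layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU()]
layers += [nn.Flatten(), nn.Linear(channels * 64 * 64, code_size)]
encoder = nn.Sequential(*layers)

x = torch.rand(1, 3, 64, 64)
print(encoder(x).shape)  # torch.Size([1, 128])
```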
2023-12-08
autoencoder with histogram loss
Stupid experiment, just to get a feeling for the parameters.
2023-12-03
Reservoir computing
by Mu-Kun Lee and Masahito Mochizuki (arxiv:2309.06815)
2023-11-27
Experiments with vision transformers
Using the torchvision.models.VisionTransformer on the FMNIST dataset,
with torchvision.transforms.TrivialAugmentWide data augmentation.
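The setup looks roughly like this (a minimal sketch; the ViT hyperparameters are my own guesses, not the ones from the experiments):

```python
import torchvision
from torchvision import transforms

# FashionMNIST is 28x28 grayscale; the ViT patch embedding expects 3 channels,
# so the grayscale images are replicated to 3 channels before augmentation.
transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.TrivialAugmentWide(),
    transforms.ToTensor(),
])
dataset = torchvision.datasets.FashionMNIST("data", train=True, download=True, transform=transform)

model = torchvision.models.VisionTransformer(
    image_size=28, patch_size=7,      # 4x4 = 16 patches
    num_layers=6, num_heads=4,
    hidden_dim=128, mlp_dim=256,
    num_classes=10,
)

image, label = dataset[0]
print(model(image.unsqueeze(0)).shape)  # torch.Size([1, 10])
```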
2023-11-16
variational auto-encoder on RPG Tile dataset
There is a deep love/hate relationship with neural networks.
Why the heck do I need to train a small network like this
2023-11-12
Autoencoder training on MNIST dataset
Using a "classic" CNN autoencoder and varying the kernel size of all layers:
2023-11-09
"implicit neural representation"
which mainly means: calculate position + code -> color.
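In code, that position + code -> color mapping is just a small MLP over concatenated inputs (a minimal sketch, not the model from the post; the code size is an assumption):

```python
import torch
import torch.nn as nn

code_size = 64  # assumption: size of the per-image latent code

# maps an (x, y) position plus a latent code to an RGB color
model = nn.Sequential(
    nn.Linear(2 + code_size, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 3), nn.Sigmoid(),
)

positions = torch.rand(1024, 2)                        # pixel coordinates in [0, 1]
code = torch.randn(code_size).expand(1024, code_size)  # same code for every position
colors = model(torch.cat([positions, code], dim=1))
print(colors.shape)  # torch.Size([1024, 3])
```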