Pangyoalto - Pangyoalto Blog


How Do Search Engines Handle Case-Insensitive Search?

What Does "Same Character" Mean? When building a search engine, you quickly realize that the concept of "same character" is far from simple. Should a search for `cafe` also find documents containing `café`? In most cases, yes. Should a search for `ＡＢＣ` (full-width characters) also return documents containing `ABC` (half-width

How Do Search Engines Handle Case-Insensitive Search? (in Korean)

Information Retrieval

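What does "same character" mean? When building a search engine, you come to realize that the notion of "same character" is less simple than it seems. When a user searches for `cafe`, should documents containing `café` be found too? In most cases, yes. When a user searches for `ＡＢＣ` (full-width characters), should documents containing `ABC` (half-width characters) come up as well? Of course they should.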

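The matching questions above are commonly handled by normalizing both the indexed text and the query to one canonical form before comparing them. The following is a minimal sketch under that assumption, not necessarily the approach the post itself goes on to describe. It uses only Python's standard `unicodedata` module; the function name `normalize_for_search` is a hypothetical helper introduced here for illustration.

```python
import unicodedata


def normalize_for_search(text: str) -> str:
    """Hypothetical helper: map text to a key for loose, caseless matching."""
    # NFKC folds compatibility forms, e.g. full-width letters to ASCII letters.
    folded = unicodedata.normalize("NFKC", text)
    # casefold() is a stronger lower() intended for caseless comparison.
    folded = folded.casefold()
    # NFD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", folded)
    # Drop the combining marks (characters with a non-zero combining class).
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose, so scripts such as Hangul return to precomposed syllables.
    return unicodedata.normalize("NFC", stripped)
```

With this helper, `cafe` and `café` map to the same key, as do `ＡＢＣ` and `ABC`. Whether accents should be dropped at all depends on the language, since in some languages an accented letter is a distinct letter, so this step is best treated as configurable.
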
Paid-members only

2025 Retrospective (in Korean)

[Paper review] REFRAG: Encoding/Decoding for RAG Optimization

This article reviews the paper REFRAG: Rethinking RAG based Decoding. RAG is one of the first methods considered when applying LLMs to services. The key advantage of RAG is its ability to leverage domain knowledge without model fine-tuning. However, as the knowledge base grows, longer contexts must be fed as

[Paper review] REFRAG: Encoding/Decoding for RAG Optimization (in Korean)

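This post reviews the paper REFRAG: Rethinking RAG based Decoding. RAG is one of the first approaches considered when applying LLMs to a service. Its key advantage is that it can use domain knowledge without fine-tuning the model. However, the more knowledge there is to draw on, the longer the context that must be supplied as input, which leads to high latency and memory consumption. In practice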

[Paper review] Vector Compression Method for HNSW – Flash

This post reviews the paper Accelerating Graph Indexing for ANNS on Modern CPUs. To fully understand this paper, familiarity with HNSW and Product Quantization is helpful. HNSW is the most widely used algorithm for vector search. Since the HNSW paper was published in 2016, it has now been in use

[Paper review] Vector Compression Method for HNSW - Flash (in Korean)

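This post reviews the paper Accelerating Graph Indexing for ANNS on Modern CPUs. The paper is easier to follow if you already know HNSW and Product Quantization. Both have been covered on this blog, so if you are unfamiliar with them, I recommend reading the posts below first. A Look at Vector Search Algorithms (1): Similarity Search and HNSW; A Look at Vector Search Algorithms (2)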

[Paper review] Databases in the Era of Memory-Centric Computing

Compared to the rapid advancement of CPU performance, the growth rate of memory capacity has been steadily slowing down. Stemming from this issue, Databases in the Era of Memory-Centric Computing proposes an architecture aimed at efficiently utilizing memory, which has become increasingly expensive relative to other system components. The paper

[Paper review] Databases in the Era of Memory-Centric Computing (in Korean)

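Compared with the pace of CPU performance improvements, growth in memory capacity keeps slowing. Starting from this problem, Databases in the Era of Memory-Centric Computing proposes an architecture for making efficient use of memory, which is becoming relatively more expensive, and argues that this design is particularly effective for databases. The paper was published in 2025 and treats CXL, a relatively new technology, as the core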

[Paper review] Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations

The paper we'll be discussing, Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations, was one of the papers that won a Best Paper Award at SIGIR 2024. It proposes an algorithm that performs vector search on the inverted index structure often used in traditional search engines, rather than

[Paper review] A New Inverted Index Structure for Efficient Sparse Vector Search (in Korean)

Information Retrieval

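The paper covered here, Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations, is one of the papers selected as Best Papers at SIGIR 2024. It proposes an algorithm that performs vector search using the inverted index approach common in traditional search engines, rather than the graph approach common in vector search. Introduction: Search is about retrieving the documents that best match a user's query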

A guide to the LLM Agent Framework for the average developer

Talk of AI and LLMs is everywhere these days. I think it is becoming less and less necessary to study LLM models or serving in order to use LLMs well, because more and more services offer LLM APIs and their prices keep dropping over time. Beyond the LLM model,

A Guide to LLM Agent Frameworks for Ordinary Developers (in Korean)

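Talk of AI and LLMs is everywhere these days. I think the need to study LLM models or serving in order to use LLMs well will keep shrinking. More and more services offer LLM APIs, and their prices keep falling over time. Companies are also going beyond releasing LLM models and launching AI-powered services. AI agents are one of these, and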

Don't trust Go GC too much - detecting memory leaks and managing GC cycles

This article is an English translation of the post that I published on the Naver Tech Blog (D2). Noir is a data-specific search engine written in Go that works well for services that have separate data for each user, such as mail. Noir is used by NAVER Mail, a message

A Look at Vector Search Algorithms (2): HNSW, SPANN (in Korean)

Information Retrieval

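This post looks at HNSW and SPANN among vector search algorithms. Introduction: The earlier post "Similarity Search and HNSW" briefly covered Voronoi diagrams, Product Quantization, and HNSW. It was a pity that the post could not cover HNSW in detail or compare it with other ANN algorithms. This post examines HNSW (2016), currently the most widely used vector search algorithm, and, from Microsoft,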

When to stop using only bloom filters: Ribbon filter

This article is an English translation of a paper review that I published on Medium. What kind of data structure should we use to determine which keys are in which sets? The simplest way to do this is to use a hash map that holds all

[k8s] k8s DNS and iptables Update Timing Every Developer Should Know (in Korean)

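In a k8s environment, a service can become temporarily unavailable during a rollout restart even when the application is deployed redundantly. This happens because, when a pod's DNS record and IP are updated, CoreDNS and iptables can be updated at different times. This post separates the case where pods are deployed as a StatefulSet from the case where they are deployed as a Deployment, and looks at how the issue can arise in each. [Summary] * In k8s + gRPC, rolling out pods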

[Go] A Graceful Shutdown Guide for Server Developers (in Korean)

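[Summary] * Applying graceful shutdown prevents the case where a client, because of a timeout, cannot retry a failed request against another server. * Both gRPC and the echo framework provide graceful shutdown functions, so it is easy to implement. Why graceful shutdown is needed: If a server terminates abruptly while handling requests, even requests that were almost finished return an error instead of a normal response. For the client, an error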

[Google Series] #3. Chubby (feat. ZooKeeper) (in Korean)

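Google, which now dominates not only the search market but many other fields, was founded as recently as 1998. Google grew very quickly and built infrastructure systems to support that growth. Those systems have greatly influenced other companies and developers. The Google Series picks out and introduces some of the papers describing them. This post covers not only Google's Chubby but ZooKeeper along with it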

[Google Series] #2. Bigtable: A Distributed Storage System for Structured Data (2006) (in Korean)

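Google, which now dominates not only the search market but many other fields, was founded as recently as 1998. Google grew very quickly and built infrastructure systems to support that growth. Those systems have greatly influenced other companies and developers. The Google Series picks out and introduces some of the papers describing them. 1. The Google File System (2003) 2. BigTable: A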

[Google Series] #1. The Google File System (2003) (in Korean)

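Google, which now dominates not only the search market but many other fields, was founded as recently as 1998. Google grew very quickly and built infrastructure systems to support that growth. Those systems have greatly influenced other companies and developers. The Google Series picks out and introduces some of the papers describing them. 1. The Google File System (2003) 2. BigTable: A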

AI Deep Dive Review (in Korean)

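💡Disclaimer: This review was written with a copy of the book provided by Hanbit Media. I have always been interested in AI, but never had the time to study it at length. So, to learn easily and quickly, I used to browse various AI books. Most were either introductory books for general readers or implementation-focused books for writing deep learning models. None of them laid out the big picture. In this book, the author, since 2015,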

eBPF/XDP: Secure and Fast Networking You Might Not Know About

[Summary] * eBPF: Technology enabling application developers to run programs within the kernel space, allowing them to utilize OS functions at runtime. * XDP: Technology based on eBPF that supports packet processing. * XDP consists of XDP driver hooks and the eBPF virtual machine, enabling secure packet processing in the kernel space with

eBPF/XDP: The Secure and Fast Networking Only You Don't Know About (in Korean)

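[Summary] * eBPF: A technology that lets programs run inside kernel space, so that application developers can use OS functionality at runtime. * XDP: A technology built on eBPF that supports packet processing. * XDP consists of the XDP driver hook, the eBPF virtual machine, and other components, and can perform packet processing in kernel space safely and with high performance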

A Quick Look at the Short History of RAG (From the First Paper to Recent Trends) (in Korean)

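[Summary] * RAG first appeared in Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (2020), where it was used to train the model and the retriever. * Recently, RAG is used mainly to leverage a model's inference performance rather than to train the model. * What matters in RAG is how well documents are retrieved and passed to the LLM. * Google's Vertex AI, which supports RAG, uses not only vector search for document retrieval but also traditional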