Posts

(ICML 2019) EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
EfficientNet is an influential image classification paper that has drawn wide attention for its strong accuracy relative to model size and compute. For projects constrained by training time or computational resources, it offers a practical way to improve ConvNet performance: an efficient, scalable method for training convolutional neural networks that balances accuracy against computational cost, which makes it well suited to real-world AI deployment.
🔗 Research Paper: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
📌 Key Resources & Reviews
📖 Paper Review Summaries: Bellzero's Review, Laonple Blog Review
💻 Source Code (PyTorch Implementation): GitHub: EfficientNet-PyTorch
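The "scalable method" in the paper is compound scaling: depth, width, and input resolution are scaled together by a single coefficient. The sketch below only illustrates that formula, using the base constants reported in the paper (1.2, 1.1, 1.15). The function names are hypothetical, and the released B1 to B7 models use rounded, hand-adjusted values, so the numbers here will not match them exactly.

```python
# Base constants reported in the EfficientNet paper for the B0 baseline:
# depth (ALPHA), width (BETA), and resolution (GAMMA).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    """Approximate FLOPS growth, which scales as depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


if __name__ == "__main__":
    for phi in range(4):
        depth, width, resolution = compound_scale(phi)
        print(
            f"phi={phi}: depth x{depth:.2f}, width x{width:.2f}, "
            f"resolution x{resolution:.2f}, FLOPS x{flops_multiplier(phi):.2f}"
        )
```

The constants are chosen so that each step of phi roughly doubles the FLOPS budget, which is why a single knob is enough to trade accuracy against compute.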

(Environment) (Cairo) – (1) Installing Required Libraries to Convert SVG Files to PNG

Installing Required Libraries to Convert SVG Files to PNG
The goal was to apply NBP OCR to SVG images and record the output as JSON (text, coordinates). To achieve this, the first step was to convert the ".svg" file into ".png". Initially, I thought that simply installing cairo via pip would suffice, but I soon realized that additional steps were necessary. To avoid losing time on the same issues when I repeat the setup later, I decided to document the process and share it.
If you need code references or run into any setup issues, please feel free to leave a message here: https://github.com/shipjobs/HAND2TEXT/issues
Environment
Language: Python 3.8.3 64bit
Operating System: Windows 10
Development Environment: Visual Studio Code
To convert an ".svg" file to ".png", if you import the libraries as shown below, you will naturally encounter a reference error:
import cairo
from svglib.svglib import svg2rlg
import cairosvg
For those...
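Once the libraries import cleanly, the conversion step itself can be sketched as follows. This is a minimal sketch, assuming the `cairosvg` package and the native Cairo library it depends on are already installed. The helper names `png_path_for` and `convert_svg_to_png` and the file name `sample.svg` are hypothetical and are not taken from the HAND2TEXT repository.

```python
from pathlib import Path


def png_path_for(svg_path):
    """Derive the output path by swapping the .svg suffix for .png."""
    return Path(svg_path).with_suffix(".png")


def convert_svg_to_png(svg_path, png_path=None):
    """Render an SVG file to PNG with CairoSVG and return the output path."""
    # Imported lazily so that a missing Cairo installation fails here,
    # at conversion time, rather than when this module is loaded.
    import cairosvg

    out = Path(png_path) if png_path else png_path_for(svg_path)
    cairosvg.svg2png(url=str(svg_path), write_to=str(out))
    return out


# Hypothetical usage, assuming sample.svg exists in the working directory:
# convert_svg_to_png("sample.svg")
```

Keeping the import inside the function makes the setup problem easy to spot: if the native library is missing, the error appears at the call site instead of breaking every script that imports the helper.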

(Cloud) NCP > AI Service > OCR review

CLOVA OCR (optical character reader) Service review
Recognizes documents and accurately extracts text and data from user-specified regions.
Trying CLOVA OCR (optical character recognition) once should make the other AI Services offered by NCP easier to approach, so I decided to write this review.
[Access]
Products & service: https://console.ncloud.com/dashboard
OCR Service path: Classic / CLOVA OCR / Domain
[How to use]
The service is set up so that, depending on the selected service type (General / Template / Document), a Text OCR / Template Builder / Document button is shown.
Text OCR (extracts text only) and the Template Builder (specify the regions to read directly, extract the recognized values, then test and send the results) correspond to the following two methods, depending on the service type:
1. General OCR: reads all of the text present in a png or jpg image or a pdf, which is what we usually think of as OCR.
2. Template OCR: reads text from predefined regions of an image, such as a driver's license, credit card, or resident registration certificate.
The Document method uses a specialized machine-learning model engine that understands the semantic structure of a document and automatically extracts the entered information (key-value).
Predefined recognition models are provided for business registration certificates, credit cards, receipts, ID cards, and business cards, and you can choose among them.
[There are also two ways to use these type-specific services.]
1. Access through the UI provided on the NCP OCR site: register the desired image file by drag and drop and ...

(CVPR 2019) A Style-Based Generator Architecture for Generative Adversarial Networks

  StyleGAN — Official TensorFlow Implementation Material related to our paper is available via the following links: Paper:  https://arxiv.org/abs/1812.04948 Video:  https://youtu.be/kSLJriaOumA Code:  https://github.com/NVlabs/stylegan FFHQ:  https://github.com/NVlabs/ffhq-dataset Additional material can be found on Google Drive:

(NeurIPS 2020) Dynamic allocation of limited memory resources in reinforcement learning

Dynamic allocation of limited memory resources in reinforcement learning

(CVPR 2020, Best Paper Award) Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild

 Unsupervised Learning of Probably Symmetric Deformable 3D Objects  from Images in the Wild Shangzhe Wu Christian Rupprecht Andrea Vedaldi Visual Geometry Group, University of Oxford {szwu, chrisr, vedaldi}@robots.ox.ac.uk

(Research) AI Blink Detection and Reminder (1) – Project Introduction

[Open Source + Research Paper + Jetson Nano] Blinker Project
This project aims to implement a blink detection system using dlib and OpenCV, following an existing open-source implementation. It builds on facial recognition techniques to detect eye blinks in real time. According to the contributor's description, the system can be applied in scenarios such as:
🚗 Drowsy driving detection – Alerting drivers when signs of fatigue are detected.
📚 Student monitoring – Analyzing drowsiness and focus levels in study environments.
By leveraging Jetson Nano, this project explores the integration of edge AI for real-time blink detection, opening possibilities for applications in safety, education, and human-computer interaction.
🚀 Easily Implementable with a Camera and Software Development Setup
If you have a camera and a properly configured software development environment, this project is relatively easy to implement. In this project, we...
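The excerpt does not say which blink criterion the project uses. One common approach with dlib's 68-point facial landmarks is the eye aspect ratio (EAR), computed from the six landmarks around each eye. The sketch below assumes that approach. The function names, the 0.2 threshold, and the two-frame minimum are illustrative assumptions, not values from the project.

```python
import math


def eye_aspect_ratio(eye):
    """Compute the EAR from six (x, y) eye landmarks ordered p1..p6.

    p1 and p4 are the eye corners, p2 and p3 lie on the upper lid,
    and p5 and p6 lie on the lower lid.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def count_blinks(ear_values, threshold=0.2, min_frames=2):
    """Count blinks in a per-frame EAR sequence.

    A blink is a run of at least min_frames consecutive frames below
    threshold, counted when the eye reopens. Both defaults are
    assumed values for illustration and need tuning per camera.
    """
    blinks = 0
    closed_frames = 0
    for ear in ear_values:
        if ear < threshold:
            closed_frames += 1
        else:
            if closed_frames >= min_frames:
                blinks += 1
            closed_frames = 0
    return blinks
```

In a real pipeline the six points would come from the landmark predictor for each video frame (indices 36 to 41 and 42 to 47 in the 68-point model), and the two eyes' ratios are often averaged before thresholding.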

(Research) Face Recognition (1) – Project Introduction

Let's explore the Face Recognition Project! 😊
In this project, we will make the most of open-source resources with the following objectives:
1️⃣ Understand key libraries and source code
2️⃣ Set up the required development environment
3️⃣ Optimize performance for better accuracy and efficiency
4️⃣ Share results and discuss potential improvements
The goal is to go beyond simple implementation, actively improving the system and exploring ways to enhance its capabilities. 🚀

(ISSN 2249-3905) Natural Language Processing: A Review

ABSTRACT  1. Introduction  2. Scope and objective  3. Previous Works On NLP (Brief History)  4. Natural Language Processing Overview  5. Applications of NLP  6. Challenges and failures  7. Current and Future progress of NLP  8. Conclusions  References 

THE BEST ARTIFICIAL INTELLIGENCE JOURNALS

THE BEST ARTIFICIAL INTELLIGENCE JOURNALS
Hello! In this post I want to gather information that may help people who are doing research or pursuing a degree, as well as people working at companies who want to start publishing papers. I became curious about which conferences AI researchers and engineers gather at, and about each one's history, characteristics, and trends.
You can easily find a specific paper through Google search or "http://www.arxiv-sanity.com/", but I believe that knowing the characteristics and trends of each conference helps our research activities, including getting papers published, so I will organize that information and keep sharing the process as I go.
1. NeurIPS (NIPS)
    : The largest annual conference for researchers and engineers, a venue for sharing new findings, collaborating, and advancing the AI industry together
    : History: 1987; Fields: broad, including cognitive science and applied machine learning
    : Link: https://papers.nips.cc/
2. ICML (International Conference on Machine Learning)
    : Focused on machine learning; together with NeurIPS and ICLR, one of the three major conferences with a large influence on machine learning and artificial intelligence research. Exact dates vary from year to year, but the paper submission deadline is generally at the end of January, and the conference is generally...